AI-Assisted Development for Scientific Workflows
Scope Clarification: This guide teaches vibe coding—using AI tools (Claude, Cursor AI) to build applications. It is not about building LLM-focused applications or agentic AI systems. That is a separate, more advanced topic requiring different approaches and infrastructure. Note: This guide reflects my personal approach to vibe coding on a MacBook Pro M4. The principles apply universally, but you're welcome to use different tools, platforms, or technology stacks that suit your workflow.
Vibe coding is an AI-assisted development approach where you orchestrate two specialized tools: Claude for strategic thinking and architecture, and Cursor AI for code implementation. This isn't about writing less code—it's about having AI partners handle implementation details while you focus on scientific logic and workflow design.
The honest metaphor: picture two houses side by side. Vibe coding builds functional applications quickly, but they're constructed on rocky foundations, like the first house. Everything works, the rooms are livable, but the structural integrity beneath isn't engineered for permanence.
The second house? That's engineer-guided AI development: proper foundation, load-bearing walls, engineered for longevity. It takes longer to build and requires more expertise, but it's designed to last and scale.
For scientists: If you're building research tools for personal use or small teams, the rocky foundation is often perfectly acceptable. The house stands, it serves its purpose, and you can always rebuild with better foundations later if needed. Just be aware of what you're building on, and don't be surprised when production deployment requires more engineering rigor than vibe coding provides.
Vibe coding is deliberate scaffolding. It's designed to get scientists building functional applications quickly, but it's also the foundation for more sophisticated AI engineering practices. Think of it as training wheels that teach you the terrain before you need to navigate it independently.
AI Engineering is the discipline of building production-grade systems that leverage large language models and other AI capabilities. Unlike vibe coding, which focuses on rapid prototyping with heavy AI assistance, AI engineering emphasizes:
Reliability, monitoring, error handling, logging, deployment pipelines
Token optimization, caching strategies, model selection, budget tracking
Systematic prompt testing, output validation, regression testing, performance benchmarks
RAG implementations, agent frameworks, multi-model orchestration, state management
| Aspect | Vibe Coding | AI Engineering |
|---|---|---|
| Primary Goal | Get working prototype quickly | Build production-ready systems |
| AI Role | Generate most of the code | Assist with specific components |
| Code Understanding | High-level comprehension acceptable | Deep understanding required |
| Testing Approach | Manual testing, "does it work?" | Automated tests, evaluation frameworks |
| Error Handling | Basic try/catch, ask AI to fix | Comprehensive error strategies, logging |
| Prompt Management | Ad-hoc, conversational | Versioned, tested, optimized |
| Cost Awareness | Subscription-based, limited tracking | Per-token budgets, optimization strategies |
| Debugging | Copy errors to AI, iterate | Systematic debugging, root cause analysis |
| Documentation | AI-generated, often outdated | Actively maintained, up-to-date |
| Deployment | Local or simple hosting | CI/CD pipelines, monitoring |
| Time to First Version | Hours to days | Days to weeks |
| Scalability | Single user, prototype scale | Multi-user, production scale |
Vibe coding isn't wasted time. These skills and knowledge directly transfer to AI engineering:
You've learned what LLMs can and cannot do reliably through direct experience
Clear communication with LLMs, context management, specification clarity
FastAPI structures, data flow, API design from generated applications
Pattern recognition for common errors, understanding stack traces
Package management, virtual environments, common libraries
Your research problem understanding—this doesn't change
Honest assessment: Most scientists don't need to become AI engineers. Vibe coding is sufficient for research tools, internal applications, and proof-of-concept work. Consider the transition when:
Your tool needs to serve colleagues or external users reliably
Running hundreds or thousands of times per day with cost implications
Failures would significantly impact research outcomes or timelines
Need to publish methods or share with the scientific community
Multi-step agent systems, RAG implementations, or custom fine-tuning
Moving toward computational roles or building AI-powered products
Start with vibe coding. Build working tools for your research. Understand what's possible and what's difficult. Then, if your needs grow beyond prototypes, you'll have the foundation to either learn AI engineering yourself or effectively collaborate with engineers who can take your vision to production.
Stuck on a setup step? Encountering errors? Not sure what a command does? Ask Claude for help!
"python3 -m venv venv. What does this mean?"
"What's the difference between pip and pip3?"
"Can you explain what uvicorn does in step 5?"
Claude can provide platform-specific guidance, debug errors, explain technical terms, and walk you through setup challenges step-by-step. This is exactly the kind of assistance Claude excels at—don't hesitate to use it!
Step 1: Open Claude and describe your scientific problem in a new chat within a Project
Step 2: Request architecture and file specifications
Step 3: Export specifications as Markdown
Step 4: Open Cursor, create project structure, import specifications
Step 5: Use Cursor to generate initial code with Sonnet 4.5
Step 6: Test locally, debug with Cursor
Step 7: Escalate conceptual issues back to Claude
This is a real vibe coding example. The Confluence Analyzer application was built using exactly this workflow. Follow these steps to replicate the process and build your own version.
Expected Time: 2-4 hours for complete implementation | Difficulty: Intermediate | Technologies: FastAPI, HTMX, Tailwind CSS, scikit-image
This is YOUR step. You need to describe your scientific application in a way that Claude can understand and design. The quality of your initial prompt directly impacts the quality of Claude's architectural output.
What does your app need to do?
Why does this exist?
What are you handling manually?
Create a [platform] [language] application that implements [core function] for [scientific domain].
Here is the method/protocol to follow:
[URL to reference methodology, paper, or protocol]
[Optional: additional context or requirements]
[Optional: Example data or inputs]
Notes:
- I will handle [manual step 1] myself
- [Platform-specific constraint]
- [Any other limitations or preferences]
This is what was actually used for the Confluence Analyzer:
create a local Py app that implements Confluence Assessment for cells in life sciences.
Here is the method to follow:
https://www.thermofisher.com/blog/life-in-the-lab/how-to-measure-cell-confluency/
Notes:
- I will setup and activate a venv, and install requirement.txt manually.
Why this worked: it names the platform (local), the language (Python), the core function (confluence assessment), the scientific domain (life sciences), links the reference methodology, and is explicit about which setup steps stay manual.
Your Task: Write Your Own Prompt
Before moving to Step 2, craft your initial prompt following the template above. Think about your scientific problem, find reference methodology, and be honest about constraints.
These are the types of images the Confluence Analyzer application needs to process:
Four microscopy images showing cells at varying confluence levels (approximately 10%, 40%, 60%, and 90% coverage)
Interactive HTML viewer showing main window, settings panel, batch processing interface
SVG showing 7-step processing pipeline with timing estimates
Technology stack recommendations and system architecture
Critical Orchestration Step: The mockups and diagrams from Claude were saved as files, then referenced in the Cursor AI prompt.
This is the manual file management that makes vibe coding work—you move specifications from Claude to Cursor.
create a local Py app that implements Confluence Assessment for cells in life sciences.
Here is the method to follow:
https://www.thermofisher.com/blog/life-in-the-lab/how-to-measure-cell-confluency/
/Users/raminderpal/.../mockups_viewer.html
/Users/raminderpal/.../05_data_flow_diagram.svg
Here are example input images:
/Users/raminderpal/.../example_images.png
Notes:
- I will setup and activate a venv, and install requirement.txt manually.
Referenced: example_images.png
Key Differences from Claude Prompt: Same core request, but now includes absolute file paths to Claude's outputs AND the example images. Cursor uses all of these to generate implementation code that matches the design specifications and handles the expected input format.
confluence-assessment/
├── src/
│   ├── main.py
│   ├── config.py
│   ├── image_processing.py
│   ├── batch_processor.py
│   └── models.py
├── templates/
│   ├── index.html
│   ├── results.html
│   └── batch.html
├── uploads/
├── outputs/
└── requirements.txt
Cursor also generated comprehensive documentation including installation steps, usage instructions, API endpoints, and troubleshooting guides. This wasn't requested—it followed best practices automatically.
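For orientation, here is what the heart of src/main.py might look like. This is a minimal hedged sketch, not the actual generated code: the route names, the upload flow, and the estimate_confluence helper are assumptions based on the structure above.

# src/main.py - minimal sketch of the FastAPI entry point (illustrative only)
from pathlib import Path

from fastapi import FastAPI, Request, UploadFile
from fastapi.templating import Jinja2Templates

app = FastAPI(title="Confluence Assessment")
templates = Jinja2Templates(directory="templates")
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)

@app.get("/")
def index(request: Request):
    # Render the main upload page (templates/index.html)
    return templates.TemplateResponse("index.html", {"request": request})

@app.post("/analyze")
async def analyze(file: UploadFile):
    # Save the upload, then hand off to the image-processing module
    dest = UPLOAD_DIR / file.filename
    dest.write_bytes(await file.read())
    # confluence = image_processing.estimate_confluence(dest)  # hypothetical helper
    return {"filename": file.filename, "status": "received"}

Assuming the package layout above, a sketch like this would run locally with uvicorn src.main:app --reload.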
python3 -m venv venv
Expected total development time: 2-4 hours from prompt to working application
This guide extends vibe coding to machine learning and drug discovery workflows. The core principle: separate strategic thinking from implementation using Claude Opus for reasoning and Cursor Sonnet for building.
Purpose: High-level reasoning and scientific strategy
"Design a feature engineering strategy for small molecule binding affinity."
Why Opus?
Output: Text specification or detailed blueprint
Purpose: Implementation and code generation
"Here is the spec for the feature engineering class. Write the PyTorch implementation."
Why Sonnet?
Output: Production-ready code
Scenario: You need to build a new graph neural network (GNN) layer for molecular property prediction.
Time: 5-10 minutes
I am designing a GNN for drug discovery. I need to engineer edge features based on bond type and stereochemistry. Review standard literature (like RDKit conventions) and propose a robust JSON schema for these features. Warn me about potential edge cases with disconnected graphs.
What to expect: Detailed analytical plan, JSON/text specification, edge case warnings, literature references
Action: Copy the summary and key specifications into a text file
Create a new file in Cursor (e.g., specs/edge_features_spec.md)
Why this matters: You are feeding the "smart" reasoning into the "fast" coder. This grounds Cursor and prevents naive implementation mistakes.
# Edge Features Specification

## Feature Schema
[JSON or structured description]

## Constraints
- No null values in bond types
- Must handle disconnected atoms
- Stereochemistry: must preserve 3D orientation

## Edge Cases
- Aromatic bonds: treat as hybrid sp2
- Implicit hydrogens: RDKit convention
Tool: Cursor Composer (Cmd+I / Ctrl+I) | Time: 5-15 minutes
Model verification: Go to Cursor Settings → Models → Confirm "Claude 3.5 Sonnet" (NOT cursor-small)
Read `specs/edge_features_spec.md`. Create a new file `features/bond_processing.py`. Implement the `BondFeaturizer` class described in the spec. Use `rdkit` and `torch_geometric`. Ensure strict typing.
Expected: 90%+ correct first attempt (following high-quality blueprint)
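For a sense of what Cursor typically produces from such a spec, here is a hedged minimal sketch. The class name comes from the prompt above; the exact feature schema, the stereochemistry flag, and the toy SMILES input are assumptions, not the real output.

# Minimal sketch of the BondFeaturizer described in the spec (illustrative)
import torch
from rdkit import Chem

BOND_TYPES = [
    Chem.BondType.SINGLE,
    Chem.BondType.DOUBLE,
    Chem.BondType.TRIPLE,
    Chem.BondType.AROMATIC,
]

class BondFeaturizer:
    """Encode bond type (one-hot) plus a stereochemistry flag per bond."""

    def featurize(self, mol: Chem.Mol) -> torch.Tensor:
        feats = []
        for bond in mol.GetBonds():
            one_hot = [float(bond.GetBondType() == t) for t in BOND_TYPES]
            # Spec constraint: preserve stereochemistry information
            stereo = float(bond.GetStereo() != Chem.BondStereo.STEREONONE)
            feats.append(one_hot + [stereo])
        # Disconnected atoms contribute no bonds; shape stays (num_bonds, 5)
        return torch.tensor(feats, dtype=torch.float32).reshape(-1, len(BOND_TYPES) + 1)

# Toy usage (hypothetical input)
mol = Chem.MolFromSmiles("C/C=C/C")  # trans-2-butene: one stereo double bond
print(BondFeaturizer().featurize(mol))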
Tool: Cursor Chat (Cmd+K / Ctrl+K) | Time: 5-10 minutes (iterative)
Run the generated code, highlight problematic blocks, ask for specific fixes:
Each refinement cycle: 2-3 prompts until convergence
Time: 5 minutes
Integrate the BondFeaturizer into `data/molecule_loader.py`. Update the `__getitem__` method to use the new edge features. Also update `config/train_config.yaml` with new feature dimensions.
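As a rough picture of the integration Cursor performs here, a hypothetical __getitem__ update might look like the excerpt below; the attribute names (self.mols, self.bond_featurizer) and the _build_graph helper are illustrative, not taken from an actual repo.

# Hypothetical excerpt from data/molecule_loader.py after integration
def __getitem__(self, idx):
    mol = self.mols[idx]                     # RDKit Mol prepared at load time
    data = self._build_graph(idx)            # existing node features and labels (assumed helper)
    data.edge_attr = self.bond_featurizer.featurize(mol)  # new edge features
    return data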
| Task | Best Tool/Model | Why? |
|---|---|---|
| Scientific Strategy | Claude App (Opus) | Deepest reasoning; catches scientific logic errors |
| System Architecture | Claude App (Opus) | Better at "big picture" component design |
| Writing Code | Cursor (Sonnet 3.5) | Faster, fewer syntax errors, better file management |
| Refactoring | Cursor (Sonnet 3.5) | Composer feature unmatched for multi-file edits |
| Quick Fixes | Cursor (Sonnet 3.5) | Instant context awareness of your repo |
| Algorithm Debugging | Cursor Chat (Sonnet 3.5) | Understands your entire codebase |
| New Module Design | Claude App (Opus) | Handles abstract design better than Sonnet |
Opus = "What and Why" | Sonnet = "How and When"
Write down Opus reasoning before asking Sonnet to code. Reduces iteration cycles by 50%+
Use Cmd+I to edit multiple files at once. Perfect for features that span multiple modules
Sonnet is "too agreeable"—check for data leakage, silent column drops, assumption mismatches
Each cycle takes 5-10 minutes. 4 cycles = polished, production-ready module
For a typical ML feature engineering module (300-500 lines), expect roughly four such cycles from first Opus prompt to polished code.
Vercel offers zero-configuration deployment for static sites and serverless functions. It's ideal for getting vibe-coded prototypes online quickly. This guide focuses on practical deployment, not production engineering—these are demos and prototypes, not enterprise applications.
Single-page apps, documentation, visualizations, interactive demos
Simple REST endpoints, data processing, file conversions (max 10s execution)
Get a live URL in minutes to share with collaborators or supervisors
100GB bandwidth, unlimited deployments, automatic HTTPS, custom domains
Serverless functions timeout at 10s (hobby) or 60s (pro). No overnight model training.
Blob storage free tier: 1GB total. Not for datasets, images, or user uploads at scale.
Every request gets a fresh container. No persistent in-memory state.
250MB uncompressed limit (50MB compressed). PyTorch, TensorFlow, full scipy stacks often exceed this with all dependencies.
Vercel Blob is object storage for files. It's useful for storing user uploads, generated images, or processed data files.
from vercel_blob import put, get, list_blobs
# Upload file
blob = await put('data.csv', file_content, token=BLOB_TOKEN)
# Returns: {'url': 'https://...', 'pathname': 'data.csv'}
# Download file
data = await get('data.csv', token=BLOB_TOKEN)
# List all files
files = await list_blobs(token=BLOB_TOKEN)
Reality: For scientific prototypes, 1GB is enough for demo purposes but not for production datasets. Use external storage (S3, Dropbox, Google Drive) for larger files.
Important limitation: The Confluence Analyzer uses FastAPI with image processing (scikit-image, NumPy). While individual libraries might seem manageable, their transitive dependencies (scipy, joblib, etc.) can push the total beyond Vercel's 250MB uncompressed serverless limit. This walkthrough demonstrates the process, but you'd likely need to either: (a) significantly simplify dependencies, (b) use Vercel Edge Functions with WebAssembly, or (c) deploy the full app elsewhere (Railway, Render, Fly.io).
For demonstration, we'll deploy a simplified version: just the static HTML interface without the Python backend. Users can upload images, but analysis would require a separate backend deployment.
Create a simple project structure:
confluence-demo/
├── index.html          # Your main HTML file
├── assets/
│   ├── logo.png        # Any images
│   └── styles.css      # Optional CSS
└── vercel.json         # Optional config
cd confluence-demo
git init
git add .
git commit -m "Initial commit"

# Create GitHub repo and push
git remote add origin https://github.com/yourusername/confluence-demo.git
git push -u origin main
Option A: Via Vercel Dashboard (Easiest)
Import your GitHub repo from the Vercel dashboard, and your site goes live at https://confluence-demo.vercel.app
Option B: Via Vercel CLI
# Install Vercel CLI
npm install -g vercel

# Login
vercel login

# Deploy (from project directory)
vercel
# Follow prompts:
# - Link to existing project? No
# - Project name? confluence-demo
# - Directory to deploy? ./
# - Auto-detected settings okay? Yes

# Deploy to production
vercel --prod
Every push to your GitHub main branch triggers automatic deployment:
# Make changes
echo "Updated content" >> index.html

# Commit and push
git add .
git commit -m "Update content"
git push

# Vercel automatically deploys within 30-60 seconds
# Preview URL: https://confluence-demo-git-main.vercel.app
# Production URL: https://confluence-demo.vercel.app
For FastAPI apps like the Confluence Analyzer, you'll need to adapt for serverless constraints:
api/
├── index.py            # Main FastAPI app
└── requirements.txt    # Python dependencies
public/
└── index.html          # Static frontend
vercel.json             # Deployment config
{
  "builds": [
    {
      "src": "api/index.py",
      "use": "@vercel/python"
    }
  ],
  "routes": [
    {
      "src": "/api/(.*)",
      "dest": "api/index.py"
    }
  ]
}
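To make the layout concrete, here is a minimal hedged sketch of api/index.py. Vercel's Python runtime serves an ASGI app object; the health endpoint shown is an illustrative assumption, not part of the Confluence Analyzer.

# api/index.py - minimal FastAPI app for Vercel's Python runtime (sketch)
from fastapi import FastAPI

app = FastAPI()  # the @vercel/python builder serves this ASGI app

@app.get("/api/health")
def health():
    # Keep handlers quick: hobby-tier functions time out at 10 seconds
    return {"status": "ok"}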
Reality check: The Confluence Analyzer requires scikit-image (~40MB) + NumPy (~15MB) + Pillow (~3MB) = ~58MB for the core libraries, which fits within the 250MB uncompressed limit. However, scikit-image pulls in scipy (~100MB), joblib, and other transitive dependencies that can push the total deployment size to 200MB+.
If your deployment approaches the 250MB limit, use the two-brain approach to optimize:
Prompt example:
"My FastAPI app with scikit-image is ~200MB uncompressed. I need to deploy to Vercel (250MB limit). Analyze my requirements.txt and suggest: (1) lighter alternatives for image processing, (2) dependencies I can remove, (3) architecture changes to reduce bundle size."
Claude will identify heavy dependencies, suggest alternatives (e.g., PIL instead of scikit-image for basic operations), and propose architectural changes.
Take Claude's recommendations to Cursor:
"Refactor the image processing module to use Pillow instead of scikit-image. Replace the gaussian_filter function with PIL's ImageFilter.GaussianBlur. Update requirements.txt and all affected functions."
Cursor will handle the refactoring, update imports, and modify your requirements.txt file.
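As an illustration of the kind of swap involved, a Pillow-only blur might look like the sketch below. The function name and radius default are assumptions, not the actual refactor.

# Pillow-only Gaussian blur - avoids pulling scipy into the bundle (sketch)
from PIL import Image, ImageFilter

def gaussian_blur(path: str, radius: float = 2.0) -> Image.Image:
    """Approximate scikit-image's gaussian filter using PIL alone."""
    img = Image.open(path).convert("RGB")
    return img.filter(ImageFilter.GaussianBlur(radius=radius))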
After optimization, check your deployment size locally:
# Install dependencies and check size
pip install -r requirements.txt --target ./package
du -sh ./package
# Compare before and after
This iterative approach often reduces deployment size by 40-60% while maintaining functionality.
# Variables (dynamic typing)
name = "experiment_01"
temperature = 37.5
is_valid = True
samples = [1, 2, 3, 4, 5]
# Type checking
type(temperature) # <class 'float'>
# If statements
if temperature > 35:
    print("High temperature")
elif temperature > 25:
    print("Normal temperature")
else:
    print("Low temperature")

# For loops
for sample in samples:
    print(f"Processing sample {sample}")

# While loops
count = 0
while count < 5:
    count += 1
# Basic function
def calculate_average(values):
    return sum(values) / len(values)

# With type hints
def process_data(data: list[float]) -> float:
    """Process data and return average."""
    return sum(data) / len(data)

# Lambda functions
square = lambda x: x ** 2
# Creating and manipulating lists
samples = [1, 2, 3, 4, 5]
samples.append(6)
samples.extend([7, 8])
first = samples[0]
last = samples[-1]
subset = samples[1:4] # Slicing
# List comprehension
squared = [x**2 for x in samples]
filtered = [x for x in samples if x > 3]
# Creating and using dictionaries
experiment = {
    "name": "test_01",
    "temperature": 37.5,
    "duration": 120
}

# Accessing values
temp = experiment["temperature"]
temp = experiment.get("temperature", 25)  # With default

# Iterating
for key, value in experiment.items():
    print(f"{key}: {value}")
import numpy as np
# Creating arrays
arr = np.array([1, 2, 3, 4, 5])
zeros = np.zeros((3, 3))
ones = np.ones((2, 4))
random = np.random.rand(5)
# Operations
mean = arr.mean()
std = arr.std()
normalized = (arr - arr.mean()) / arr.std()
# Text files
with open('data.txt', 'r') as f:
    content = f.read()  # entire file as a single string
# read() consumes the stream; re-open (or f.seek(0)) before readlines()
with open('data.txt', 'r') as f:
    lines = f.readlines()  # list of lines, newlines included

# CSV files with pandas
import pandas as pd
df = pd.read_csv('data.csv')

# JSON files
import json
with open('config.json', 'r') as f:
    config = json.load(f)
# Text files
with open('output.txt', 'w') as f:
    f.write('Results\n')
    f.writelines(['line1\n', 'line2\n'])

# CSV with pandas
df.to_csv('output.csv', index=False)

# JSON
data = {"experiment": "test_01", "result": 42}
with open('results.json', 'w') as f:
    json.dump(data, f, indent=2)
from pathlib import Path
# Modern path handling
data_dir = Path('data')
file_path = data_dir / 'experiment.csv'
# Check existence
if file_path.exists():
    print("File found")
# Create directories
output_dir = Path('output')
output_dir.mkdir(exist_ok=True)
# Create virtual environment
python -m venv venv
# Activate (macOS/Linux)
source venv/bin/activate
# Activate (Windows)
venv\Scripts\activate
# Deactivate
deactivate
# Install package
pip install numpy
# Install from requirements
pip install -r requirements.txt
# Create requirements file
pip freeze > requirements.txt
# Install specific version
pip install numpy==1.24.0
# .env file
API_KEY=your_api_key_here
DATABASE_URL=postgresql://localhost/db
# Loading in Python
from dotenv import load_dotenv
import os
load_dotenv()
api_key = os.getenv('API_KEY')
project/
├── venv/               # Virtual environment
├── src/                # Source code
│   ├── __init__.py
│   ├── main.py
│   └── utils.py
├── tests/              # Test files
│   └── test_main.py
├── data/               # Data files
├── .env                # Environment variables
├── .gitignore
├── requirements.txt
└── README.md