How I vibe code in production


1. Core Beliefs

  • LLMs = absurd typing speed (tokens/sec ≫ human)

    I treat an LLM as a tireless pair programmer with absurd typing speed. With the right prompt, it can output significant amounts of code quickly, often entire tickets or modules.

  • LLMs can write production-grade code

    Most of what comes out is genuinely solid if you tell it exactly what you want. Tests usually pass on the first try.

  • Code quality = instruction quality

    If the result is poor, 90% of the time it’s a prompt or context problem, not a sign that the LLM can’t write it well.

  • Unconstrained LLMs drift → enforce strict patterns

    If you don’t specify patterns explicitly, you’ll get inconsistent or arbitrary results. The LLM will choose something that likely won’t align with your needs. Be explicit.

  • Context is king

    The LLM only knows what’s publicly available or what you explicitly provide, but it can make effective use of any detailed context you supply.


2. Prep: Craft the Prompt

  • Think through what you want: edge cases, surfaces, and behavior

    Before interacting with the LLM, spend significant time figuring out exactly what the feature should do, all states/flows, and failure scenarios. Example: enforcing chronological constraints on “target years” simplified backend logic and UI significantly by preventing bad user states.

  • Write brutally explicit, constrained instructions

    “Make X do Y, using method Z, avoid Q.” This step requires substantial attention and clarity.

  • Encode model-specific quirks you’ve learned

    Different LLMs have specific quirks, and over time you’ll learn to anticipate them. For example, Claude 4 Sonnet tends to over-engineer when it detects production usage, unless you explicitly instruct it not to.

  • Exploit large contexts

    Modern LLMs handle large contexts well (e.g., Claude 4 Sonnet with a 60k character limit). Provide comprehensive details: API references, code examples, edge cases, etc.

  • Leverage ChatGPT (outside Cursor) to transcribe requirements/designs into prompts

    For design-heavy or product-driven tasks, GPT-4o or o3 effectively translates requirements, user flows, or design screenshots into prompts; o3 excels at generating pixel-perfect frontend prompts from screenshots. Recently, using this method, Claude generated a nearly pixel-perfect frontend implementation directly from a structured JSON prompt.

  • (Useful) Dictate ideas via ChatGPT speech-to-text

    Using speech-to-text helps in capturing spontaneous ideas clearly and quickly. ChatGPT transcribes spoken thoughts directly into structured text, simplifying subsequent editing and refinement.
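
To make the prep steps above concrete, here is a minimal sketch of assembling a “Make X do Y, using method Z, avoid Q” prompt from structured parts. The `build_prompt` helper, its section layout, and the example endpoint are hypothetical illustrations, not a real tool or the exact format I use:

```python
# Hypothetical helper: assembles a "brutally explicit" prompt from
# structured parts (goal, method, constraints, supplied context).

def build_prompt(goal: str, method: str, avoid: list[str], context: dict[str, str]) -> str:
    """Render a constrained instruction block plus supporting context."""
    lines = [
        f"Make {goal}, using {method}.",
        "Hard constraints:",
    ]
    lines += [f"- Avoid: {item}" for item in avoid]
    lines.append("Context (use it, do not invent beyond it):")
    for name, body in context.items():
        lines.append(f"## {name}\n{body}")
    return "\n".join(lines)


prompt = build_prompt(
    goal="the /reports endpoint return paginated results",
    method="cursor-based pagination",
    avoid=["offset pagination", "changing the response schema"],
    context={"API reference": "GET /reports?cursor=<id>&limit=<n> ..."},
)
```

The point is the shape, not the helper: every section forces you to state the goal, the method, the forbidden paths, and the context explicitly instead of leaving the model to guess.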


3. Plan Phase (GPT-4.1)

  • Ask: “Scan these files, plan implementation – no code.”

    GPT-4.1 excels at reading large contexts and creating detailed high-level implementation plans, although it struggles with specific function calls or direct integration tasks.

  • Review plan → spot design flaws in your own thinking

    Reviewing the AI-generated plan can reveal overlooked complexities or flaws in your initial design.

  • Iterate until plan is obvious + simple

    Simplify relentlessly—complexity is typically a signal to revisit and refine your design.
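
The plan-phase ask above can be sketched as a simple prompt template. The `plan_prompt` helper, the file names, and the wording are illustrative assumptions, not a fixed recipe:

```python
# Illustrative sketch: a plan-phase prompt that forbids code output,
# mirroring the "Scan these files, plan implementation - no code" ask.

def plan_prompt(files: dict[str, str], feature: str) -> str:
    """Build a prompt asking for a numbered implementation plan only."""
    parts = [
        f"Scan the files below and produce a step-by-step implementation plan for: {feature}.",
        "Output a numbered plan only. Do NOT write any code.",
    ]
    for path, source in files.items():
        parts.append(f"--- {path} ---\n{source}")
    return "\n\n".join(parts)


p = plan_prompt(
    files={
        "billing/models.py": "class Invoice: ...",
        "billing/api.py": "def create_invoice(): ...",
    },
    feature="support partial refunds on invoices",
)
```

Keeping “no code” as an explicit constraint is what makes the plan a reviewable checkpoint rather than a premature implementation.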


4. Implement Phase (Claude 4 Sonnet)

  • Prompt: “Implement the plan.”

    Claude 4 Sonnet rapidly generates complete implementations, including tests and documentation.

  • Walk away → grab coffee ☕

    Typically, the output is extensive and thorough, significantly reducing manual effort required for initial coding.

  • Returns with hundreds/thousands of LOC + passing tests

    Often uncovers edge-cases or implementation details you hadn’t initially considered.

  • Usually 90% done; minor tweaks only

    Adjustments usually involve minor logic tweaks or test refinements (e.g., adjusting test setups).
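
The Plan → Implement split can be wired up as a tiny two-phase pipeline. `plan_model` and `impl_model` below are placeholders for real model calls (e.g., GPT-4.1 for planning, Claude 4 Sonnet for implementation); the stubs only demonstrate the control flow, not actual SDK usage:

```python
from typing import Callable

# Sketch of the two-phase workflow: one model call produces a plan,
# a second (usually a different model) implements it. Both callables
# are placeholders for whichever LLM client you actually use.

def plan_then_implement(
    plan_model: Callable[[str], str],
    impl_model: Callable[[str], str],
    task: str,
) -> tuple[str, str]:
    """Run the plan phase, then feed the reviewed plan to the implement phase."""
    plan = plan_model(f"Plan the implementation of: {task}. No code.")
    code = impl_model(f"Implement this plan exactly:\n{plan}")
    return plan, code


# Stub models so the sketch runs without any API access.
plan, code = plan_then_implement(
    plan_model=lambda p: "1. Add field. 2. Migrate. 3. Expose in API.",
    impl_model=lambda p: "# generated code for: " + p.splitlines()[0],
    task="add an archived flag to projects",
)
```

In practice the review step sits between the two calls: you read and iterate on `plan` before handing it to the implementation model.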


5. Cursor Settings

  • Agent mode only – full control over edit vs add

    Explicitly choose when the AI should edit or add code. Other modes don’t provide sufficient precision.

  • Never use auto model routing → explicitly select the model

    Different models have unique strengths and weaknesses. Auto-routing generally leads to less predictable and lower-quality results.


6. Productivity Impact

  • Coding speed ↑ significantly

    Entire features or significant code portions can be completed rapidly, shifting focus to high-value activities.

  • More time for architecture & review

    Time saved translates into deeper architecture discussions, design improvements, and code quality enhancements.

  • Less grunt typing, more thinking

    Shift from typing code to conceptual and architectural thinking.


7. Takeaways

  1. Treat prompt-writing as design, not admin

    Prompts are the fundamental product—invest substantial time and thought into crafting them.

  2. Separate Think → Plan → Implement

    Each stage serves as a critical checkpoint; maintain discipline in progressing sequentially.

  3. Choose your model explicitly

    Know and leverage the strengths and quirks of your tools.

  4. AI coding transforms software production

    This approach naturally surfaces weak requirements, simplifies APIs, and enhances overall software usability and quality.