How I vibe code in production


1. Core Beliefs

  • LLMs = absurd typing speed (tokens/sec ≫ human)

    I treat an LLM as a tireless pair programmer with absurd typing speed. With the right prompt, it can output significant amounts of code quickly, often entire tickets or modules.

  • LLMs can write production-grade code

    Most of what comes out is genuinely solid if you tell it exactly what you want. Tests usually pass on the first try.

  • Code quality = instruction quality

    If the result is poor, 90% of the time it’s a prompt or context problem, not a sign that the LLM can’t write it well.

  • Unconstrained LLMs drift → enforce strict patterns

    If you don’t specify patterns explicitly, you’ll get inconsistent or arbitrary results. The LLM will choose something that likely won’t align with your needs. Be explicit.

  • Context is king

    The LLM only knows what’s publicly available or what you explicitly provide, but it can make effective use of any detailed context you supply.


2. Prep: Craft the Prompt

  • Think through what you want: edge cases, surfaces, and behavior

    Before interacting with the LLM, spend significant time figuring out exactly what the feature should do, all states/flows, and failure scenarios. Example: enforcing chronological constraints on “target years” simplified backend logic and UI significantly by preventing bad user states.

  • Write brutally explicit, constrained instructions

    “Make X do Y, using method Z, avoid Q.” This step requires substantial attention and clarity.

  • Encode model-specific quirks you’ve learned

    Different LLMs have specific quirks, and over time you’ll learn to anticipate them. For example, Claude 4 Sonnet tends to over-engineer when it detects production usage, unless you explicitly instruct it not to.

  • Exploit large contexts

    Modern LLMs handle large contexts well (e.g., Claude 4 Sonnet with a 60k character limit). Provide comprehensive details: API references, code examples, edge cases, etc.

  • Leverage ChatGPT (outside Cursor) to transcribe requirements/designs into prompts

    For design-heavy or product-driven tasks, GPT-4o or o3 effectively translates requirements, user flows, or design screenshots into prompts; o3 excels at generating pixel-perfect frontend prompts from screenshots. Recently, using this method, Claude generated a nearly pixel-perfect frontend implementation directly from a structured JSON prompt.

  • (Useful) Dictate ideas via ChatGPT speech-to-text

    Using speech-to-text helps in capturing spontaneous ideas clearly and quickly. ChatGPT transcribes spoken thoughts directly into structured text, simplifying subsequent editing and refinement.
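
To make the prep steps above concrete, here is a minimal sketch of assembling a “Make X do Y, using method Z, avoid Q” prompt from structured parts. The `build_prompt` helper, its section layout, and the example endpoint are hypothetical illustrations, not a real tool or the exact format I use:

```python
# Hypothetical helper: assembles a "brutally explicit" prompt from
# structured parts (goal, method, constraints, supplied context).

def build_prompt(goal: str, method: str, avoid: list[str], context: dict[str, str]) -> str:
    """Render a constrained instruction block plus supporting context."""
    lines = [
        f"Make {goal}, using {method}.",
        "Hard constraints:",
    ]
    lines += [f"- Avoid: {item}" for item in avoid]
    lines.append("Context (use it, do not invent beyond it):")
    for name, body in context.items():
        lines.append(f"## {name}\n{body}")
    return "\n".join(lines)


prompt = build_prompt(
    goal="the /reports endpoint return paginated results",
    method="cursor-based pagination",
    avoid=["offset pagination", "changing the response schema"],
    context={"API reference": "GET /reports?cursor=<id>&limit=<n> ..."},
)
```

The point is the shape, not the helper: every section forces you to state the goal, the method, the forbidden paths, and the context explicitly instead of leaving the model to guess.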


3. Plan Phase (GPT-4.1)

  • Ask: “Scan these files, plan implementation – no code.”

    GPT-4.1 excels at reading large contexts and creating detailed high-level implementation plans, although it struggles with specific function calls or direct integration tasks.

  • Review plan → spot design flaws in your own thinking

    Reviewing the AI-generated plan can reveal overlooked complexities or flaws in your initial design.

  • Iterate until plan is obvious + simple

    Simplify relentlessly—complexity is typically a signal to revisit and refine your design.
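
The plan-phase ask above can be sketched as a simple prompt template. The `plan_prompt` helper, the file names, and the wording are illustrative assumptions, not a fixed recipe:

```python
# Illustrative sketch: a plan-phase prompt that forbids code output,
# mirroring the "Scan these files, plan implementation - no code" ask.

def plan_prompt(files: dict[str, str], feature: str) -> str:
    """Build a prompt asking for a numbered implementation plan only."""
    parts = [
        f"Scan the files below and produce a step-by-step implementation plan for: {feature}.",
        "Output a numbered plan only. Do NOT write any code.",
    ]
    for path, source in files.items():
        parts.append(f"--- {path} ---\n{source}")
    return "\n\n".join(parts)


p = plan_prompt(
    files={
        "billing/models.py": "class Invoice: ...",
        "billing/api.py": "def create_invoice(): ...",
    },
    feature="support partial refunds on invoices",
)
```

Keeping “no code” as an explicit constraint is what makes the plan a reviewable checkpoint rather than a premature implementation.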


4. Implement Phase (Claude 4 Sonnet)

  • Prompt: “Implement the plan.”

    Claude 4 Sonnet rapidly generates complete implementations, including tests and documentation.

  • Walk away → grab coffee ☕

    Typically, the output is extensive and thorough, significantly reducing manual effort required for initial coding.

  • Returns with hundreds/thousands of LOC + passing tests

    Often uncovers edge-cases or implementation details you hadn’t initially considered.

  • Usually 90% done; minor tweaks only

    Adjustments usually involve minor logic tweaks or test refinements (e.g., adjusting test setups).
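
The Plan → Implement split can be wired up as a tiny two-phase pipeline. `plan_model` and `impl_model` below are placeholders for real model calls (e.g., GPT-4.1 for planning, Claude 4 Sonnet for implementation); the stubs only demonstrate the control flow, not actual SDK usage:

```python
from typing import Callable

# Sketch of the two-phase workflow: one model call produces a plan,
# a second (usually a different model) implements it. Both callables
# are placeholders for whichever LLM client you actually use.

def plan_then_implement(
    plan_model: Callable[[str], str],
    impl_model: Callable[[str], str],
    task: str,
) -> tuple[str, str]:
    """Run the plan phase, then feed the reviewed plan to the implement phase."""
    plan = plan_model(f"Plan the implementation of: {task}. No code.")
    code = impl_model(f"Implement this plan exactly:\n{plan}")
    return plan, code


# Stub models so the sketch runs without any API access.
plan, code = plan_then_implement(
    plan_model=lambda p: "1. Add field. 2. Migrate. 3. Expose in API.",
    impl_model=lambda p: "# generated code for: " + p.splitlines()[0],
    task="add an archived flag to projects",
)
```

In practice the review step sits between the two calls: you read and iterate on `plan` before handing it to the implementation model.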


5. Cursor Settings

  • Agent mode only – full control over edit vs add

    Explicitly choose when the AI should edit or add code. Other modes don’t provide sufficient precision.

  • Never use auto model routing → explicitly select the model

    Different models have unique strengths and weaknesses. Auto-routing generally leads to less predictable and lower-quality results.


6. Productivity Impact

  • Coding speed ↑ significantly

    Entire features or significant code portions can be completed rapidly, shifting focus to high-value activities.

  • More time for architecture & review

    Time saved translates into deeper architecture discussions, design improvements, and code quality enhancements.

  • Less grunt typing, more thinking

    Shift from typing code to conceptual and architectural thinking.


7. Takeaways

  1. Treat prompt-writing as design, not admin

    Prompts are the fundamental product—invest substantial time and thought into crafting them.

  2. Separate Think → Plan → Implement

    Each stage serves as a critical checkpoint; maintain discipline in progressing sequentially.

  3. Choose your model explicitly

    Know and leverage the strengths and quirks of your tools.

  4. AI coding transforms software production

    This approach naturally surfaces weak requirements, simplifies APIs, and enhances overall software usability and quality.