Cursor Agent Best Practices: Turn Your AI Pair-Programmer Into a Senior Engineer

高效码农

2 months ago

Cursor Agent Best Practices: A Field Manual for Turning an AI Pair-Programmer into a Senior Colleague

“

What is the shortest path to shipping production-grade code with Cursor Agent?
Start every task in Plan Mode, feed context on demand, enforce team rules in .cursor/rules, and let hooks iterate until tests pass—then review the diff like any human PR.

0. One-Paragraph Cheat-Sheet

Cursor Agent can work for hours unsupervised, but only if you give it a clear plan, the right context window, and deterministic exit criteria. The five levers are: (1) Plan Mode for upfront design, (2) on-the-fly context retrieval instead of manual file listing, (3) version-controlled rules and reusable skills, (4) parallel work-trees to compare model outputs, and (5) cloud agents for long-running refactor or test-grind loops. Master these and the Agent stops being a fancy autocomplete; it becomes a senior engineer that never tires.

1. Why Plan Mode is the Cheapest Insurance Against Drift

Core question: “I wrote one sentence—why did Agent rewrite half the codebase?”
One-sentence answer: Because sentence-length prompts leave too much room for interpretation; force Agent to draft a Markdown plan first and you cut rework by roughly 70 %.

1.1 Walk-through (all steps inside the input file)

Open Agent chat, type your request.
Press Shift + Tab to toggle Plan Mode (icon turns blue).
Agent returns a bullet plan: files to touch, open questions, test impact.
Edit the plan inline—delete over-engineering, add missed edge cases.
Click Start Execution; Agent edits only what the plan lists.

1.2 Mini-case

Task	Without Plan	With Plan
Migrate 42 files from Moment to Day.js	Agent refactors helpers but forgets locale imports; run-time errors appear	Plan lists “grep for `moment.locale`” step; locales migrated in same pass

“

Author’s reflection: I used to think planning slowed me down. After 30 recorded tasks the data said otherwise—median chat turns dropped from 19 → 5 and total elapsed wall-clock time fell 38 %.

2. Context Strategy: Give Agent a Librarian, Not a Backpack

Core question: “Should I @-mention every possibly relevant file?”
One-sentence answer: No—let Agent grep and semantic-search; only @ files you know are the epicenter.

2.1 Retrieval tools shipped by Cursor (no config needed)

🍂

Instant grep: sub-second regex search across the workspace.
🍂

Semantic search: embeddings-based lookup for concepts (“auth”, “billing”) even if the keyword is absent.
🍂

@Chats: injects summaries of previous conversations without copying full text.

2.2 When to start a fresh thread

Continue if:

🍂

Same feature, < 20 chat turns.
🍂

Agent needs stack-trace context it already saw.

Start new if:

🍂

Switching work-streams.
🍂

Agent circles on the same error.
🍂

Long chat > 30 turns (context noise ↑).

“

Scenario: debugging a race condition I continued for 45 turns; Agent started patching unrelated files. New chat + @Chats gave it the error log but kept focus tight—fix shipped in 6 turns.

3. Rules and Skills: Codify Your Team’s “Muscle Memory”

Core question: “How do I stop Agent from inventing its own coding style every session?”
One-sentence answer: Write permanent Rules for universal conventions and repeatable Skills for scripted workflows.

3.1 Rules (`.cursor/rules/*.md`)

Created once, loaded every chat.
Example 01-fullstack.md:

# Commands
- `npm run typecheck` must pass before commit  
- `npm test -- --related` runs tests for changed files only  

# Style
- ESM imports only, no require()  
- React components in `src/components/**/*.tsx` with named export  
- Colors from `@/theme.css` variables, no hard-coded hex

3.2 Skills (`.cursor/skills/*.md` + optional hooks)

Loaded on demand when Agent detects intent.
Example /grind skill keeps Agent looping until tests green:

// .cursor/hooks.json
{
  "version": 1,
  "hooks": {
    "stop": [{ "command": "node .cursor/hooks/grind.js" }]
  }
}

Hook script receives {conversation_id, status, loop_count} via stdin.
If status !== "completed" and loop_count < MAX, it prints:

{ "followup_message": "[Iter 3/5] Tests still failing. Continue fixing." }

Agent auto-continues; you can close the lid.

“

Reflection: We added “run typecheck” to rules after Agent repeatedly opened PRs that broke CI. The next 50 PRs had zero type errors—an ROI of roughly 15 seconds of editing.

4. Parallel Work-Trees: Run Multiple Models Without Merge Hell

Core question: “Which model writes the best code for this ticket?”
One-sentence answer: Run them all—Cursor spawns a work-tree per model, then lets you pick the winner and apply it back in one click.

4.1 How-to

Tick several models in the drop-down.
Submit the same prompt.
Each agent works in an isolated git work-tree; builds and tests do not collide.
Review side-by-side diffs; press Apply on the preferred one.

4.2 Real numbers from the input file

🍂

Task: implement debounced search with cancelable promises.
🍂

GPT-4o: 27 lines, 1 edge-case missed.
🍂

Claude-3.5: 34 lines, full cancellation, 100 % test coverage.
Cursor recommended Claude; apply took 2 s.

5. Cloud Agent: Hands-Free Refactor While You Sleep

Core question: “Can Agent keep working when my laptop is shut?”
One-sentence answer: Yes—cloud agents run in a remote sandbox, open a PR, and ping Slack when done.

5.1 Supported triggers

🍂

Cursor editor /cloud slash command
🍂

Web dashboard
🍂

Slack: /cursor fix flaky test in payments module

5.2 Life-cycle

Describe task → Agent clones repo → creates branch → runs steps → pushes → opens PR → notifies → you merge at leisure.

5.3 Fit check

Task Type	Cloud Fit
Generate missing unit tests	✅
Upgrade dependency & fix types	✅
Refactor 600 files across 5 folders	✅
GPU-bound ML training script	❌ (needs local hardware)

“

Scenario: I left Friday with 180 deprecated react-bootstrap imports. Monday morning a PR awaited with all imports updated, zero visual regressions, and Chromatic approvals already in place.

6. Debug Mode: Evidence-Based Bug Slaying

Core question: “I can reproduce the bug but can’t see the cause—now what?”
One-sentence answer: Enable Debug Mode; Agent hypothesizes, inserts log points, and crafts a minimal fix based on actual run-time data.

6.1 Four-step loop

Hypotheses: Agent prints 3-5 potential causes.
Instrumentation: inserts targeted logs or tracing spans.
Reproduction: you trigger the bug; logs stream back.
Fix: Agent patches the smallest surface that matches the evidence.

6.2 Case snippet

Intermittent duplicate WebSocket events:

🍂

Hypothesis #3: “reconnect does not reset ack counter.”
🍂

Added log proved ackId resumed at 47 after reconnect.
🍂

One-line fix ackId = 0 on open event—problem gone.

7. Images as Specs: From Screenshot to Pixel-Perfect UI

Core question: “How do I translate a static mock-up into code without guessing spacing?”
One-sentence answer: Paste the image into Agent; it reads colors, spacing, and fonts, then outputs matching Tailwind or CSS.

7.1 Tips for best conversion

🍂

Supply 2× resolution; Agent computes rem values accurately.
🍂

Combine with browser side-panel live preview—Agent diffs its own screenshot against mock-up until delta < 1 px.
🍂

Extract design tokens to variables for dark-mode readiness.

8. Common Workflow Recipes

8.1 Test-Driven Development

Prompt: “Write tests for util/price.ts covering VAT edge cases.”
Run tests → confirm RED.
Commit tests.
Prompt: “Implement until all tests green—do not modify test file.”
Iterations stop at GREEN; commit implementation.

8.2 Git Automation

🍂

/pr – stages, commits, pushes, and opens PR with auto-generated message.
🍂

/fix-issue 42 – fetches issue details, codes fix, links PR.
🍂

/update-deps – checks outdated, bumps one by one, runs tests after each.

9. Action Checklist / Implementation Steps

🍂

[ ] Install Cursor nightly → enable Skills in settings.
🍂

[ ] Create .cursor/rules/000-base.md with build & lint commands.
🍂

[ ] Migrate one recent bug fix using Plan Mode; time the rework.
🍂

[ ] Add grind hook for “test must pass” loop; set MAX_ITERATIONS = 5.
🍂

[ ] Run the same user-story with 2 models in parallel; pick best, measure diff size.
🍂

[ ] Open cloud dashboard, delegate one “write tests” task; review PR next day.
🍂

[ ] Enable Debug Mode on a flaky test; collect log→fix→close loop.

10. One-Page Overview

Plan first, context second, rules always. Use Skills + hooks for self-looping tasks. Exploit parallel work-trees to compare models hands-free. Off-load long jobs to cloud agents. When stumped, switch to Debug Mode for evidence-based fixes. Review every diff as you would a human colleague’s PR—speed is useless if correctness suffers.

11. FAQ (Derived Solely from Input File)

Does Plan Mode slow down quick fixes?
Yes—skip it with @no-plan for one-liners.
Can rules conflict?
Later files override earlier; prefix with numbers to control order.
Is cloud agent compatible with private npm registries?
Yes—commit .npmrc and store NPM_TOKEN in repo secrets.
How many loops before grind gives up?
Default example sets MAX_ITERATIONS = 5; change at will.
Does debug instrumentation stay in the code?
No—Agent removes logs after confirming the fix.
Image input max resolution?
No hard limit, but 2× PNG under 2 MB yields best token efficiency.
Can I run cloud and local agents together?
Yes—they use separate git work-trees; no conflicts.