
Why Senior Engineers Are Abandoning AI Coding: The Hidden Dangers of Agentic Programming

Two Years of Vibecoding: Why I Returned to Writing Code by Hand

Core Question: After relying heavily on AI-assisted coding (Agentic Coding) for a long period, why do senior engineers ultimately decide to return to writing code manually?

In the realm of software development, the journey most people share with AI coding follows a strikingly similar script. Initially, you tentatively assign it a simple task. The results impress you. Emboldened, you give it a massive task. The results leave you even more stunned. This instant gratification easily fosters an illusion that the barriers to programming have been leveled. Immediately following this, you open social media (like X) and draft a passionate manifesto about how “programmers will be replaced.”

However, if you can persist past this initial phase, looking past the early euphoria or fear, then congratulations: you understand AI coding better than 99% of people. The real challenge isn’t whether AI can write code, but what actually happens when we integrate it into serious, production-grade engineering environments.

The Inevitable Arc for Serious Engineers

Core Question for this Section: What psychological journey do engineers undergo when they attempt to use AI for actual production work rather than just weekend hobby projects?

For serious engineers using AI to tackle real work—rather than just weekend side projects—the relationship with AI generally follows a predictable development arc.

Still riding the high of that impressive “big task,” you begin to wonder if you can keep feeding it larger and larger challenges. Maybe even that haunting refactor that no one on the team wants to touch?

This is precisely where the cracks start to show.

On one hand, you are amazed at how well the AI seems to understand you. On the other hand, it makes frustrating errors and decisions that clearly violate the shared understanding you have developed. It feels like working with a talented but erratic intern: sometimes they deliver brilliance, and other times they quietly undermine your foundations.

In this back-and-forth struggle, you quickly realize a fundamental truth: getting angry at the model serves no purpose. Consequently, you begin to internalize the blame for every unsatisfactory output.

“It’s me. My prompt sucked. It was under-specified.”

You think to yourself, “If I can specify it, it can build it. The sky’s the limit.”

This line of thinking, while optimistic, opens the door to a frustrating journey of trial and error.

The Trap of Spec-Driven Development

Core Question for this Section: Why does writing prompts like detailed design documents ultimately fail to solve the deep-seated issues in AI coding?

To solve the problem of “AI not understanding enough,” many engineers (including my past self) chose a path that seemed logically sound: writing extremely detailed specification documents. You open Obsidian or Notion and begin drafting “beefy” specs, describing the feature in your head in impressive detail. You might spend half an hour writing a full-page prompt, believing that detailed input guarantees perfect output.

However, reality will likely disappoint you: spec-driven development doesn’t work in the AI domain either.

In real-world software engineering, design documents and specifications are living documents. They evolve, often dramatically, through the process of discovery during implementation.

Let’s visualize a specific scenario:

Application Scenario Simulation:
Imagine that in a real company, you wrote a design document for a complex architecture in one hour. You then handed this document to a mid-level engineer and explicitly told him: “Do not discuss this document with anyone, just do exactly what it says.” Afterward, you went on vacation, completely abstaining from any follow-up decisions.

What would be the result? In all likelihood, a disaster.

This is the exact situation current AI agents face.

  1. Lack of Evolution Ability: An agent does not possess the ability to evolve a specification over a multi-week period as it builds out lower-level components. It cannot modify the design based on feedback discovered during implementation.
  2. Rigidity in Early Decisions: It makes decisions upfront that it later does not deviate from, even if those decisions are proven wrong by later context.
  3. Surrender or Blind Persistence: Most agents simply give up once they feel the problem and solution have gotten away from them. Modern models surrender less often; instead, they force themselves through the walls of the maze, stubbornly writing code even when the logic no longer holds.

Hallucinating Slop: The Illusion of Perfect Pull Requests

Core Question for this Section: Why does AI-generated code look fantastic when reviewed in isolation but turn into a disaster when integrated into the whole project?

The most terrifying aspect is this: the code written by AI agents looks plausible and impressive while it is being written and presented to you. It even looks good in Pull Requests (PRs). After all, both you and the agent are well-trained in what a “good” Pull Request looks like.

This deception is potent. If you don’t perform a deep, full-codebase review, it is easy to be fooled by the facade.

It wasn’t until I opened up the full codebase and read its latest state cover to cover that I began to see what we had theorized and hoped was only a diminishing artifact of earlier models: slop.

It was pure, unadulterated slop. I was bewildered. Had I not reviewed every line of code before accepting it? Where did all this… gunk… come from?

Reflection / Insight:
In retrospect, it made perfect sense. Agents write units of change that look good in isolation. They are consistent with themselves and with your prompt. But respect for the whole, there is not. Respect for structural integrity, there is not. Respect even for neighboring patterns, there is not.

It is like a puzzle master holding exquisitely carved pieces who, when assembling them, pays no heed to the picture on the box and forces them together regardless of the overall pattern.

“Vibewriting” a Novel: Lack of Contextual Coherence

Core Question for this Section: What exactly are the fatal structural defects present in AI-generated code?

To illustrate this problem vividly, we can compare it to “vibewriting” a novel.

The AI had simply told me a good story. Like vibewriting a novel, the agent showed me a couple of paragraphs that, sure enough, made sense and were structurally and syntactically correct. Hell, it even picked up on the idiosyncrasies of the various characters. But for whatever reason, when you read the whole chapter, it is a mess. It makes no sense in the overall context of the book and of the preceding and following chapters.

In the world of code, this means:

  • Locally Correct, Globally Wrong: The logic inside the function is sound, but the function’s place in the system architecture is completely wrong, or it violates the system’s design principles.
  • Style Fracture: It can mimic your variable naming, but it cannot understand why you used a specific pattern in Module A (perhaps to solve a specific dependency issue), so when it blindly applies that in Module B, it creates architectural conflicts.
  • Lack of Evolutionary Perspective: It cannot understand the history of the code, such as why a seemingly strange if check exists (it is there for compatibility with legacy data). The AI may simply delete it because it looks like “dead code” (see the sketch after this list).
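
To make that last point concrete, here is a minimal, hypothetical Python sketch (the function name, field names, and the migration story are invented for illustration; none of this comes from the author’s codebase). The guard clause is exactly the kind of line that looks redundant in isolation, which is what an agent optimizing for local cleanliness tends to remove.

```python
from datetime import datetime, timezone


def parse_created_at(record: dict) -> datetime:
    """Parse the 'created_at' field of a stored record.

    Hypothetical scenario: records written before a schema migration stored
    the timestamp as epoch seconds (an int); newer records store an
    ISO-8601 string.
    """
    raw = record["created_at"]

    # This branch looks like dead code if you only read today's write path
    # (which always produces strings), but deleting it breaks every
    # pre-migration record still sitting in the database.
    if isinstance(raw, int):
        return datetime.fromtimestamp(raw, tz=timezone.utc)

    return datetime.fromisoformat(raw)


# An agent asked to "simplify timestamp parsing" sees only string timestamps
# in recent fixtures and tests, so removing the int branch would still pass
# its local checks while silently breaking legacy data.
print(parse_created_at({"created_at": 1609459200}))                   # legacy record
print(parse_created_at({"created_at": "2024-05-01T12:00:00+00:00"}))  # new record
```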

After reading months of accumulated, highly-specified agentic code, I said to myself: I’m not shipping this shit. I’m not gonna charge users for this. And I’m not going to promise users that their data is safe with this.

I’m not going to lie to my users with this.

Returning to Hand-Crafting: Real Productivity Beyond Token Efficiency

Core Question for this Section: Why is writing code by hand actually faster and more efficient than AI coding when all costs are considered?

So, I’m back to writing by hand for most things.

Amazingly, when you price everything in, and not just code tokens per hour, I’m faster, more accurate, more creative, more productive, and more efficient than AI.

This sounds counter-intuitive, given that AI types faster than any human. But here, “pricing everything in” includes:

  1. Debugging Time: Debugging the structural issues introduced by AI takes far more time than writing a clean architecture from scratch.
  2. Review Costs: To ensure the AI hasn’t introduced “slop,” you have to watch every line of code like a hawk, which consumes no less energy than writing it yourself.
  3. Architectural Refactoring: When AI-written code fails to adapt to new requirements, the cost of scrapping and rewriting is exorbitant.
  4. Psychological Burden: The immense mental pressure of constantly worrying whether the code will collapse in the global context is a massive hidden cost.

When you start calculating these “invisible debts,” the advantages of hand-crafting become apparent. Every keystroke a human makes is not just inputting characters; it is a real-time architectural decision and context verification. This synchronous process of thinking and building is something current AI agents cannot simulate.

Summary and Actionable Checklist

After two years of practice, we can clearly see that while AI coding is powerful, blindly relying on “agents” for complex system architecture and long-term maintenance brings massive technical debt.

Practical Summary / Checklist

  • Beware the Initial Shock: Don’t assume AI can seamlessly take over complex engineering systems just because it completed a simple “big task.”
  • Avoid Over-Specifying Prompts: Writing specification documents several pages long often doesn’t help. AI cannot “evolve” documents through communication like human engineers can.
  • Global Review is Mandatory: Don’t just look at Pull Requests or isolated code snippets. You must open the entire codebase and read the relevant context to check for the AI code’s structural integrity.
  • Identify “Slop” Code: Be wary of code that is locally perfect but globally fragmented. If the code feels like reading a novel with “incoherent plotlines,” it was likely generated by AI.
  • Be Honest About Delivery: If you aren’t comfortable using this code to protect user data, or don’t feel it’s worth the price tag, don’t release it. Going back to hand-refactoring is often the more responsible choice.
  • Redefine Productivity: Don’t measure productivity solely by code generation speed. Factor in debugging, reviewing, and the rework caused by architectural failures.

One-Page Summary

  • Local Performance: AI agentic coding is excellent (perfect syntax, matches the prompt); hand-crafted code varies by engineer but matches the mental model.
  • Global Consistency: AI is poor, with little respect for the overall structure; hand-crafted is good, with real-time architectural verification.
  • Documentation Dependency: AI is high, relying on static, detailed specs; hand-crafted is medium, with docs that live and evolve alongside implementation.
  • Error Types: AI produces deep architectural errors that are hidden and hard to find; hand-crafted code tends toward surface logic errors that are easy to locate and fix.
  • Comprehensive Cost: AI is high (including review, refactoring, and mental load); hand-crafted is low (a one-time investment with controllable quality).

Frequently Asked Questions (FAQ)

Q1: Is AI coding completely unusable?
A: No. The author’s critique is primarily targeted at “large tasks” and “real production environments.” For simple, isolated tasks that have little impact on the overall architecture, AI can still be helpful, but one should not blindly trust it with complex systems.

Q2: Why doesn’t writing very detailed Prompts work?
A: Because real software development is a dynamic process of discovery. Design documents are living and change as implementation proceeds. AI often solidifies decisions at the initial stage and cannot proactively evolve the spec over a multi-week development cycle.

Q3: What is “slop” in AI code?
A: It refers to code that looks perfect in isolation (like in a PR)—correct syntax and logical flow—but when placed into the context of the entire codebase, it destroys structural integrity, ignores existing patterns, and causes systemic confusion.

Q4: What does “vibecoding” in the article refer to?
A: This is a vivid metaphor comparing AI coding to “vibewriting” a novel. It might produce great paragraphs and dialogue, but when combined into a chapter, it lacks coherence and logic, failing to form a cohesive whole.

Q5: Is writing code by hand really faster than AI?
A: In terms of raw typing speed, no. But if “all factors” are counted—including finding and fixing structural errors introduced by AI, global reviewing, refactoring, and mental burden—hand coding is often more efficient and accurate in complex projects.

Q6: Why does AI fail in complex architectural tasks?
A: Because it lacks “respect for the whole.” It focuses on solving local tasks and cannot understand the codebase’s historical baggage, neighboring module patterns, or the system’s long-term evolution direction.

Q7: What does the author suggest engineers do regarding AI coding?
A: Don’t blindly trust the AI’s seemingly impressive, “hallucinated” competence. For core functionality, user data security, and complex architecture, return to hand-coding to maintain absolute control over code quality, rather than delivering unreliable “slop” to users.
