Why Smart AI Founders Are Ditching Fine-Tuning — and Betting on Context Engineering
How a painful startup lesson led one NLP veteran to redefine what “intelligence” really means in the AI age.
1. The Startup That Was Crushed by Its Own Model
Meet Peak, a co-founder of Manus and a veteran with over 10 years of experience in Natural Language Processing (NLP).
A few years ago, Peak launched an ambitious AI startup. Like many others at the time, his team decided to go all in on training their own model. They believed that with enough fine-tuning and computational horsepower, they could build a world-class product.
It didn’t end well.
Every model training cycle took one to two weeks. By the time one version finished training, the product direction had already shifted.
While they were busy optimizing for benchmark scores that users didn’t even care about, the market had already moved on.
“We weren’t killed by competitors,” Peak recalls. “We were killed by our own model.”
The worst part wasn’t the time—it was rigidity.
Once a model is fine-tuned, it becomes locked in its own “action space.”
It’s like forging an exquisite sword to slay dragons, only to wake up the next day and realize the world no longer needs dragon slayers—it needs rocket pilots.
That’s exactly what happened when new tool ecosystems like MCP (the Model Context Protocol) emerged. If they had stuck to their fine-tuned path, their product would’ve become obsolete overnight.
So when Peak founded Manus, he flipped the script:
Instead of tuning the model, he decided to engineer the context.
2. Betting on Context: A Different Kind of Intelligence
What does “context engineering” even mean?
Think of it this way:
Fine-tuning is like teaching a chef to cook a new dish from scratch.
Context engineering, however, is about building the perfect kitchen—where all the ingredients, recipes, and utensils are intelligently organized.
Peak’s insight was simple but profound:
“Big models are the giants’ battlefield. We can’t compete there.
Our moat lies in how well we use those models.”
In other words, the true edge isn’t in the model itself, but in how you feed, frame, and structure its context.
That’s context engineering — the art of helping a model understand the right things, at the right time, in the right way.
3. The Context Paradox: When More Becomes Less
Here’s the twist: giving AI more context can actually make it worse.
Lance Martin of LangChain described this as the context paradox:
“The more context an agent has, the less intelligent it becomes.”
Why? Because complex tasks require dozens of tool calls and long conversational histories. But as the context grows—say, up to hundreds of thousands of tokens—the model begins to slow down, repeat itself, and lose coherence.
Peak’s team discovered that even models with million-token windows start to rot at around 200K tokens—a phenomenon they call context rot.
Your AI isn’t dumb; it’s just drowning in its own memory.
4. The Four Pillars of Context Engineering
So how do you fix this?
Manus and LangChain identified four engineering pillars that top teams rely on:
1️⃣ Context Offloading
Don’t shove everything into the prompt.
Instead of pasting an entire 10,000-word search result, simply store the file path. The agent can retrieve it on demand.
→ Think “bring your notes,” not “memorize the textbook.”
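Here’s a minimal sketch of offloading in Python. The message format and helper names are illustrative, not Manus’s actual interface:

```python
import json
import tempfile

def offload_tool_result(result: str, threshold: int = 2000) -> dict:
    """Keep small results inline; write large ones to disk and
    return only a reference the agent can re-read on demand."""
    if len(result) <= threshold:
        return {"content": result}
    # Externalize: the full text lives on disk, not in the prompt.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".txt", delete=False
    ) as f:
        f.write(result)
        return {"path": f.name, "preview": result[:200]}

# A 10,000-word search result becomes a short reference:
message = offload_tool_result("very long search result ... " * 500)
print(json.dumps(message)[:120])  # the prompt carries only the path
```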
2️⃣ Context Retrieval
Store long-term information externally (in vector databases or memory stores) and fetch it only when needed.
→ This gives your AI long-term memory.
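A toy version of retrieval, with a plain keyword index standing in for a real vector database (a production agent would embed the entries and search semantically):

```python
class ExternalMemory:
    """Long-term store kept outside the prompt; only matching
    entries are pulled back into context when needed."""
    def __init__(self):
        self.entries: list[str] = []

    def remember(self, text: str) -> None:
        self.entries.append(text)

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Crude relevance score: words shared with the query.
        words = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(words & set(e.lower().split())),
            reverse=True,
        )
        return scored[:k]

memory = ExternalMemory()
memory.remember("User prefers concise answers in French.")
memory.remember("Project deadline is 2024-06-01.")
# Fetch only what the current turn actually needs:
print(memory.retrieve("what language should replies use?"))
```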
3️⃣ Context Isolation
Break complex tasks into smaller subtasks, each handled by a separate agent with its own mini-context.
→ Like dividing a project among specialists who don’t get in each other’s way.
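In code, isolation can be as simple as never sharing histories between sub-agents. This sketch assumes a hypothetical run_llm() call in place of a real model API:

```python
def run_llm(prompt: str) -> str:
    """Stand-in for a real model call; assumed, not a real API."""
    return f"<answer to: {prompt[:40]}...>"

def run_subagent(task: str, shared_brief: str) -> str:
    # Each sub-agent gets a fresh, minimal context: the brief
    # plus its own subtask, never its siblings' histories.
    prompt = f"{shared_brief}\n\nYour subtask: {task}"
    return run_llm(prompt)

def orchestrate(goal: str, subtasks: list[str]) -> str:
    brief = f"Overall goal: {goal}"
    results = [run_subagent(t, brief) for t in subtasks]
    # Only the compact results flow back to the coordinator.
    return run_llm(brief + "\n\nSubtask results:\n" + "\n".join(results))

print(orchestrate(
    "Write a market report",
    ["gather pricing data", "summarize competitors", "draft outline"],
))
```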
4️⃣ Context Reduction
The secret weapon: proactively trimming context before it rots.
And this is where Manus shines.
5. The Art of Compression and Summarization
Peak’s team perfected a two-step method—Compaction and Summarization.
Compaction is reversible trimming.
It removes data that can be reconstructed later (like replacing {content: "..."} with {path: "file.txt"}).
→ No information lost, just externalized.
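A rough illustration of reversible compaction, mirroring the {content} → {path} swap above (the file naming and cache layout here are mine, not Manus’s):

```python
import os
import hashlib

CACHE_DIR = "compaction_cache"

def compact(message: dict) -> dict:
    """Reversible trim: swap inline content for a file reference.
    Nothing is lost -- expand() can restore the original."""
    if "content" not in message:
        return message
    os.makedirs(CACHE_DIR, exist_ok=True)
    digest = hashlib.sha256(message["content"].encode()).hexdigest()[:16]
    path = os.path.join(CACHE_DIR, f"{digest}.txt")
    with open(path, "w") as f:
        f.write(message["content"])
    return {"path": path}

def expand(message: dict) -> dict:
    """The inverse: re-read the externalized content on demand."""
    if "path" not in message:
        return message
    with open(message["path"]) as f:
        return {"content": f.read()}

original = {"content": "a very long tool observation ..."}
slim = compact(original)          # {'path': 'compaction_cache/....txt'}
assert expand(slim) == original   # fully reversible
```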
Summarization is irreversible forgetting.
It condenses past interactions into clean summaries, freeing up space at the cost of detail.
Manus’ strategy is brilliantly balanced:
- Set a “rot threshold” (e.g., 128K tokens).
- When the threshold is reached, compress first, summarize later.
- Only compress the oldest 50% of history, keeping the newest examples fully intact.
- When summarizing, always work from the original data for accuracy, and retain the latest few actions for continuity.
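Stitched into a loop, that strategy might look something like this sketch. It reuses compact() and expand() from the compaction example above; summarize() stands in for an LLM call:

```python
ROT_THRESHOLD = 128_000   # tokens before we intervene
KEEP_LAST = 5             # newest actions kept verbatim

def count_tokens(history: list[dict]) -> int:
    # Naive stand-in for the model's real tokenizer.
    return sum(len(str(m)) // 4 for m in history)

def summarize(messages: list[dict]) -> str:
    # Stand-in for an LLM call that condenses old interactions.
    return f"<summary of {len(messages)} earlier messages>"

def reduce_context(history: list[dict]) -> list[dict]:
    if count_tokens(history) < ROT_THRESHOLD:
        return history  # still fresh; do nothing

    # Step 1 -- compact (reversible): externalize the oldest 50%,
    # leaving the newest messages intact as in-context examples.
    cutoff = len(history) // 2
    history = [compact(m) for m in history[:cutoff]] + history[cutoff:]

    # Step 2 -- summarize (irreversible): only if still over budget.
    if count_tokens(history) >= ROT_THRESHOLD:
        # Expand first, so the summary is built from original data.
        originals = [expand(m) for m in history[:-KEEP_LAST]]
        summary = {"content": summarize(originals)}
        history = [summary] + history[-KEEP_LAST:]
    return history
```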
In short, they taught AI how to forget just enough to stay smart.
6. Building on Shifting Sands — and Rebuilding Five Times
“From March to now, we’ve rebuilt the system five times,” Peak laughs.
Why? Because the AI world changes faster than any codebase can keep up.
OpenAI, Anthropic, and others keep redefining the model frontier.
But Manus remains stable—not because of what model they use, but because of how their context architecture is built.
Peak even makes a counterintuitive claim:
“Using open-source models can actually be more expensive.”
That’s because in agent-based systems, input tokens vastly outnumber output tokens. The real cost driver is KV-cache efficiency, not raw compute.
Major API providers like Anthropic have heavily optimized distributed KV caching—making their hosted models cheaper at scale than self-hosting open models.
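Some back-of-the-envelope math shows why. All prices here are hypothetical, chosen only to show the shape of the cost:

```python
# Hypothetical per-million-token prices, purely to illustrate why
# cached input dominates agent costs; real prices vary by provider.
INPUT_PRICE = 3.00          # $/M tokens, uncached input
CACHED_INPUT_PRICE = 0.30   # $/M tokens, KV-cache hit (assumed ~10x cheaper)
OUTPUT_PRICE = 15.00        # $/M tokens, output

# A typical agent step: a long, mostly-repeated prefix plus a
# short new observation and a short action.
context_tokens, new_tokens, output_tokens = 100_000, 2_000, 500

without_cache = (context_tokens + new_tokens) / 1e6 * INPUT_PRICE \
    + output_tokens / 1e6 * OUTPUT_PRICE
with_cache = context_tokens / 1e6 * CACHED_INPUT_PRICE \
    + new_tokens / 1e6 * INPUT_PRICE \
    + output_tokens / 1e6 * OUTPUT_PRICE

print(f"per step: ${without_cache:.3f} uncached vs ${with_cache:.3f} cached")
# Input dwarfs output, so over hundreds of steps the cache hit
# rate, not raw compute, drives the bill.
```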
Sometimes, buying from the “supermarket” is cheaper than growing your own crops.
7. The Philosophy: Build Less, Understand More
After two startups and countless rebuilds, Peak distilled one ultimate truth:
“Our biggest breakthrough didn’t come from adding new tricks—
it came from simplifying, removing layers, and helping the model do less but understand more.”
That’s the essence of context engineering.
It’s not about making your system smarter.
It’s about making the model’s job easier.
💡 The Takeaway
For AI developers:
Before you rush to fine-tune, ask yourself — is your context engineered well enough?
For everyone else:
AI’s intelligence isn’t just about remembering more;
it’s about knowing what to forget.
The next AI revolution won’t be about bigger models or more GPUs.
It’ll be about making machines understand humans—
not by learning harder, but by listening smarter.
Final Thought:
In a world obsessed with “adding,” Manus chose to “subtract.”
While others built taller towers of parameters, they focused on building deeper layers of understanding.
And maybe, just maybe—
that’s what real intelligence looks like.
