
A Programming Language for AI Agents: Why We Need to Rethink Code in the Age of AI

When your primary coding collaborator is an AI, the language you use to communicate needs to change.

Last year, I began pondering the future of programming languages in an era where “agentic engineering” is on the rise. My initial assumption was that the colossal mountain of existing code would cement current languages in place forever. I’ve since come to believe the opposite is true.

The way we write software is undergoing a fundamental shift. Our collaborators are no longer just human developers; they increasingly include AI agents capable of writing, reviewing, and modifying code. This partnership demands that we re-examine our most basic tool: the programming language itself. Here’s why I think we’re on the cusp of a new wave of language innovation, and what those future languages might look like.

Why New Programming Languages Will Succeed

It’s obvious that an AI agent performs better in a language it has extensively seen during its training. However, two less obvious factors critically influence an agent’s effectiveness: the quality of the language’s tooling and the rate of change in the language itself.

Take Zig as an example. It appears to be underrepresented in the training data of current large language models (at least in the ones I use), and the language specification is evolving quickly. This combination is suboptimal. While an agent can still program in Zig if provided with precise documentation, the experience is far from seamless.

On the other hand, consider Swift. It’s well-represented in training data, but agents often stumble due to its ecosystem. The tooling required to build a Mac or iOS application can be so complex that agents struggle to navigate it successfully. A rich training corpus alone isn’t a guarantee of success.

This leads to a crucial insight: An agent’s failure or success isn’t predetermined by a language’s age or popularity. What matters is the total friction in the workflow.

The most significant reason new languages can thrive now is the dramatically falling cost of writing code. When coding was expensive, a language needed a vast ecosystem of libraries to be viable. Today, if a language lacks a specific library, you can instruct an agent to port one from another language. The breadth of the ecosystem matters less.

Personally, I now find myself choosing JavaScript (TypeScript) over Python in many scenarios. Not because I prefer it, but because the agents I work with produce more reliable and correct code with it. The ecosystem advantage of Python diminishes when an agent can efficiently bridge functionality gaps.

Therefore, a new language with a strong value proposition can succeed even if it’s initially underrepresented in AI training data. If it’s designed with an understanding of how LLMs learn and operate, and if it offers a significantly smoother experience for human-AI collaboration, developers will adopt it.

The Case for a New Language: Changing Costs and Changing Users

If agents can manage with existing languages, why design a new one? The answer lies in the fundamental changes in who is programming and what the goals of programming are.

Many modern languages were designed under a key constraint: typing is laborious. We traded explicitness for brevity. Widespread type inference is a prime example—you don’t have to write types, making code shorter. The trade-off is that you now need a Language Server Protocol (LSP) or complex compiler messages to understand what type an expression has. Agents find this just as confusing as humans do during code reviews.
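To make the trade-off concrete, here is a small TypeScript sketch (the data and names are illustrative). Both versions compile to the same thing; the difference is what a reviewer, human or agent, can learn from the line alone:

```typescript
type Item = { name: string; price: number };
type Order = { items: Item[] };

const orders: Order[] = [
  { items: [{ name: "pen", price: 2 }, { name: "pad", price: 3 }] },
  { items: [{ name: "ink", price: 10 }] },
];

// Inferred: correct, but a reviewer needs an LSP hover to know the type.
const totalsInferred = orders.map((o) =>
  o.items.reduce((sum, item) => sum + item.price, 0),
);

// Explicit: the annotation answers "what is this?" on the line itself,
// with no tooling required.
const totalsExplicit: number[] = orders.map((o: Order): number =>
  o.items.reduce((sum: number, item: Item) => sum + item.price, 0),
);
```

The explicit version is longer, which is exactly the point: when writing is cheap and reading is the bottleneck, the extra tokens buy reviewability.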

While the cost of writing code is plummeting, the cost of understanding it is becoming the bottleneck. We are generating more code than ever, and its clarity is paramount. We might actually want more verbose code if it reduces ambiguity during review and long-term maintenance.

Furthermore, we are moving toward a world where some code is never directly seen by a human. It is generated, reviewed, and maintained by machines. Even in this scenario, we need to explain to a user—who may not be a programmer—what the code does, without requiring them to understand how it does it.

The argument for a new language is this: given the seismic shifts in who writes code, who reads it, and what their costs are, we have a unique opportunity to design a language optimized for this new reality.

What Do AI Agents Prefer in a Language?

Agents don’t have “wants” in a human sense, but we can infer their preferences by observing what makes them efficient or causes them to fail. By measuring metrics like the number of file changes or iterations needed for common tasks, we can identify language characteristics that are agent-friendly.

1. Context Without Needing a Live LSP

The Language Server Protocol is fantastic for IDEs, providing semantic autocomplete and type information. However, it requires a running server with a full view of the codebase. Agents often operate in contexts where running an LSP is impossible or impractical—for example, when analyzing a standalone code snippet from documentation or a single file from a GitHub repository.

A language that offers clear, local context without external tooling is a major advantage. Agents need a unified way to understand code, whether they have a full project or just a fragment.

2. Explicit Delimiters Over Significant Whitespace

As a Python developer, it hurts to admit this, but significant whitespace is problematic for agents. Getting indentation right is token-inefficient for LLMs. When asked to make precise, “surgical” edits without tool assistance, they often break indentation and rely on a formatter to clean up afterward.

On the flip side, languages with dense runs of closing delimiters (like Lisp’s parentheses) can also be problematic. Tokenizers can split these sequences in unexpected ways, making it easy for an LLM to lose count. The ideal is a syntax with clear, explicit, balanced delimiters that are easy to tokenize and track.

3. Explicit “Flow Context”

A powerful concept is the ability to carry implicit context—like a request ID, user authentication, or a timestamp—through a call chain without manually passing it to every function. This is invaluable for observability and debugging.

The challenge is that implicit dependencies can appear unexpectedly. If a function deep in the call stack starts needing the current time, how does that requirement become known to its callers?

One experiment is using explicit effect markers. A function declares what implicit contexts it needs (e.g., needs { time, db }). If a caller doesn’t provide it, a linter warning appears, and an auto-formatter can propagate the annotation upward. This makes dependencies visible and testable.

fn createToken(user: UserId, scopes: []Scope) -> Token
    needs { time, rng } // Explicitly declares dependencies
{
    return Token{
        user: user,
        expires: time.now().add(24h),
        scopes: scopes,
    }
}

test "token has future expiry" {
    using time = time.fixed("2026-02-06T23:00:00Z");
    using rng = rng.deterministic(seed: 1);

    let token = createToken("user123", ["read"]);
    assert(token.expires > time.now());
}

4. Result Types Over Exceptions

Agents are often baffled by exception handling. They tend to wrap code in overly broad catch-all blocks, log the error, and perform poor recovery. This is understandable, as exception paths are poorly documented in most codebases.

Checked exceptions force the issue into the type signature but often create verbose, cascading changes. A more promising direction is rich, typed result objects. However, this requires a type system designed for easy composition of success and error types.
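Such result objects can be sketched in TypeScript with a discriminated union (the `parsePort` example is hypothetical, not from any library). Success and error are both ordinary values, and the union forces every caller to check which case it holds before touching the payload:

```typescript
// A minimal Result type: the `ok` discriminant tells the type checker
// (and the agent) which fields are safe to access.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Hypothetical example: parsing a port number without throwing.
function parsePort(raw: string): Result<number, string> {
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 1 || n > 65535) {
    return { ok: false, error: `not a valid port: ${raw}` };
  }
  return { ok: true, value: n };
}

const good = parsePort("8080");
const bad = parsePort("http");
```

Unlike a thrown exception, the failure path is part of the signature, so an agent reading only this file knows exactly what can go wrong and what type the error carries.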

5. Minimal, Stable Diffs and Line-Based Clarity

Agents typically read code line-by-line. This becomes problematic with multi-line strings or literals. An agent might mistakenly edit inside a multi-line string containing embedded code, thinking it’s actual program logic.

Languages that minimize multi-line constructs and encourage a style where the structure is clear from a single line lead to more stable diffs and fewer agent errors. Trailing commas in lists and arrays, for example, prevent unnecessary line changes when adding new items.
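The trailing-comma point fits in a few lines of TypeScript. With the comma in place, appending an item later touches only the new line; without it, the previous line must also change to gain a comma, doubling the diff and the chance of a bad surgical edit:

```typescript
// Appending "reviewer" to this list changed exactly one line,
// because "editor" already ended with a comma.
const roles: string[] = [
  "author",
  "editor",
  "reviewer",
];
```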

6. Greppable Code

One of Go’s unsung virtues is its package system. You use http.Request, not just Request. This fully-qualified naming, while slightly more verbose, makes code trivially searchable with tools like grep. An agent can instantly understand where a symbol comes from without resolving complex import aliases or re-exports.

When code is generated on the fly, this greppability is crucial for large-scale refactoring and analysis scripts. It reduces false positives and makes the codebase more navigable for both machines and humans.
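The same style is available in TypeScript via namespace imports. In a real project this would be `import * as pricing from "./pricing"`; here an object stands in for the module so the sketch stays self-contained:

```typescript
// Simulating package-qualified access in a single file. The `pricing`
// object plays the role of a module namespace.
const pricing = {
  applyDiscount(total: number, percent: number): number {
    return total * (1 - percent / 100);
  },
};

// Every call site reads `pricing.applyDiscount`, so a plain
// grep for "pricing.applyDiscount" finds all uses, with no false
// positives from other modules that also export an applyDiscount.
const sale = pricing.applyDiscount(200, 10);
```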

7. Support for Local Reasoning

Agents excel when they can understand a piece of code in isolation. They often work with only a few files loaded in their context window. Language features that require global knowledge—like complex macro expansions, dynamic metaprogramming, or implicit global state—break this model of local reasoning.

A language that keeps dependencies explicit and interfaces clear allows an agent to work effectively on a module without needing the entire program in its head.

8. Dependency-Aware and Deterministic Builds

A major point of failure for agents is unpredictable builds. If an agent can’t reliably determine what needs to be recompiled or retested after a change, it will waste cycles and produce errors.

Languages like Go enforce clear rules: no circular imports, a standard package layout, and cached test results. This creates a deterministic and fast feedback loop for the agent. A language designed for agents would likely enforce similar constraints to make builds perfectly reproducible and incremental.

What Do AI Agents Struggle With?

Conversely, some common language and ecosystem features create disproportionate friction for AI agents.

1. Macros and Complex Metaprogramming

Agents, like many humans, struggle with macros. The original argument for macros was code generation to reduce boilerplate. Now that agents can generate boilerplate easily, the cost of obscure, hard-to-reason-about macros often outweighs the benefit.

Generics and compile-time code execution (like Zig’s comptime) are generally easier for agents because they follow more predictable patterns of substitution.

2. Barrel Files and Re-Exports

“Barrel files” (index files that re-export many modules) destroy greppability and local reasoning. When an agent sees an imported function calculate(), it can’t tell if it comes from ./utils/math.ts or ./vendor/legacy/calculations.ts without tracing through layers of re-exports.

This leads to incorrect imports, wasted context on reading irrelevant files, and general confusion. A simple rule—the import path should point directly to the file of declaration—solves this.

3. Import Aliasing

Aliasing (import { veryLongName as vln } from './module') is a minor convenience for humans but a source of errors for agents. It decouples the symbol name in the code from its source, breaking simple text search. A language should encourage clear naming in the source module rather than aliasing at import time.

4. Flaky Tests and Non-Deterministic Environments

Agents are ironically both victims and perpetrators of flaky tests. They love to use mocking, which, when done poorly, introduces non-determinism. Many languages make it easier to write tests that depend on system time, random numbers, or unmanaged global state than to write isolated, deterministic ones.

A language designed for the AI era would likely build first-class support for dependency injection and deterministic testing primitives, making the “happy path” the path to stable tests.
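Even without language support, the shape of such a primitive can be approximated today with explicit injection. This TypeScript sketch (the names are illustrative) mirrors the earlier token example: the clock is a parameter, not a global, so the test is deterministic regardless of when it runs:

```typescript
// A clock interface instead of a direct call to Date.now().
interface Clock {
  now(): number; // milliseconds since epoch
}

function createToken(user: string, clock: Clock, ttlMs: number) {
  return { user, expires: clock.now() + ttlMs };
}

// A fixed clock makes "the token expires in the future" testable
// without any dependence on the machine's real time.
const fixedClock: Clock = { now: () => 1_700_000_000_000 };
const token = createToken("user123", fixedClock, 24 * 60 * 60 * 1000);
```

A language with first-class effect declarations, as sketched above, would make this pattern the default rather than a discipline each codebase has to enforce by convention.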

5. Fragmented and Permissive Toolchains

An agent thrives on clear, binary feedback: “the change works” or “it doesn’t.” Many ecosystems provide muddled feedback. In TypeScript, you can often run code that fails type checking. A bundler might succeed locally but fail in CI due to a subtle configuration difference.

This “gaslights” the agent. The ideal toolchain has one command that checks, compiles, and gives a definitive yes/no answer. Linting failures should be auto-fixable where possible, so the agent can focus on logic, not style.

FAQ: AI Agents and Programming Languages

Q: Will AI agents make programmers obsolete?
A: No. Instead, they are becoming powerful collaborators. The role of the human programmer is shifting from writing every line of code to directing, specifying, reviewing, and designing systems. This makes the clarity of communication—through specifications, tests, and yes, the programming language itself—more important than ever.

Q: Can’t we just improve existing languages for agents?
A: Incremental improvements are possible and already happening (better LSPs, linters). However, some agent-hostile features are deeply embedded in a language’s design philosophy. Achieving optimal human-AI collaboration may require a clean-slate design that isn’t bound by backwards compatibility constraints.

Q: How would you measure if a language is “agent-friendly”?
A: Through objective, quantifiable metrics:

  • Iteration Count: How many edit-compile-test cycles does an agent need to complete a standard task?
  • Context Load: How many files must an agent read to understand a single function’s dependencies?
  • Build Determinism: What percentage of builds are reproducible given the same source code?
  • Toolchain Uniformity: How many distinct commands are needed to go from a code change to verified correctness?

Q: Wouldn’t a language designed for agents be terrible for humans?
A: Not necessarily. Many principles that help agents—explicitness, local reasoning, greppability, deterministic builds—also benefit human developers, especially in large teams or long-lived projects. The goal is a language that serves as a superior communication medium for both.

Conclusion: The Inevitability of New Languages

We will see new programming languages emerge. The absolute amount of software is growing exponentially, which alone guarantees more language experimentation. More importantly, the economic rationale has changed.

Previously, launching a successful language required building a massive ecosystem—a hurdle few could clear. Now, the initial bar can be lower: create a language that makes AI agents exceptionally productive for a specific domain. If it truly enhances the human-AI partnership, the ecosystem can be built with the help of the very agents that use it.

I hope we see two trends. First, “outsider art”—innovative languages from people unburdened by traditional compiler theory dogma. Second, a more rigorous, data-driven approach to language design. For decades, language debates were dominated by opinion and anecdote. Now, we can measure success by how efficiently an agent navigates the resulting code.

We are entering an era where the facts of machine readability and collaboration efficiency will shape our tools. The programming language of the future won’t be designed just for the human hand typing on a keyboard, but for the hybrid intelligence system of a human mind guiding an AI agent. Designing that language is one of the most exciting software engineering challenges of our time.
