Anthropic Core Insight: Stop Building Redundant AI Agents, Focus on Crafting Skills for Specialized Capabilities

In the research, development and implementation of AI Agents, many teams have fallen into a common misconception: building a dedicated Agent from scratch for every business scenario, paired with independent toolchains and scaffolding. However, in an internal sharing session held three months ago, Barry and Mahesh from Anthropic put forward a different view: instead of creating new Agents for each scenario, it’s far more effective to equip existing general-purpose Agents with “Skills”. This view rests on three things: the practical experience of Claude Code, real-world data from the first five weeks of the Skills ecosystem, and predictions about where AI Agent architecture is ultimately heading. Today, we’ll break down the logic behind this viewpoint and explore how Skills are redefining the way we build an Agent’s capabilities.

I. Lessons from Claude Code Practice: General AI Agents Don’t Equal Specialized Capabilities

For a long time, R&D teams assumed that Agents for different fields must be “structurally distinct”, each needing a tailored toolchain and scaffolding. After all, the requirements of finance, scientific research, office work and other scenarios vary drastically. The Anthropic team held this exact assumption during the development of Claude Code, yet hands-on practice overturned it.

1. Initial Assumptions vs. Real-World Results

Claude Code is essentially a general-purpose AI Agent. When handling typical tasks such as “generating financial reports”, its core operational logic is highly unified: call APIs to pull data → organize data via the file system → run Python scripts for analysis → output standardized documents. The underlying support for the entire process only requires bash and a file system, meaning the core scaffolding can be extremely streamlined. This proves that Agents for different domains do not need entirely independent underlying architectures; the basic framework of a general-purpose Agent is sufficient to support multi-scenario requirements.

2. The Core Analogy: A Genius with High IQ ≠ A Professional Expert

To illustrate the core shortcoming of general-purpose Agents, the Anthropic team used a vivid analogy: when you need to file your taxes, would you choose a mathematical genius with an IQ of 300, or an experienced tax expert? The answer is obviously the latter. No one would expect the high-IQ genius to deduce the 2025 tax laws from first principles: even if they could, it would take a tremendous amount of time, with no guarantee that the result complies with professional practice.

Most AI Agents available today are just like this “high-IQ genius”: they possess extremely strong general reasoning abilities, and can produce impressive results with sufficient guidance, yet they suffer from prominent core pain points:

Lack of critical contextual knowledge in specific fields, making it difficult to directly meet the needs of professional scenarios;
Inability to learn and reuse experience from historical tasks, leading to the widespread problem of reinventing the wheel.

3. Key Pain Points of Current AI Agents

In summary, Agents that rely solely on general capabilities face two core challenges in real-world implementation:

「Lack of professional experience」: No matter how strong their logical reasoning ability is, they cannot replace the practical experience accumulated over a long period in a specific field;
「No continuous learning capability」: After completing a task, the relevant procedural knowledge cannot be retained, and the Agent still has to re-derive solutions when encountering similar problems next time.

This is the core reason why Anthropic shifted its focus to building Skills — to fill the gap of “professional experience” for general-purpose Agents.

II. The Minimalist Design of Skills: Essentially a Reusable “Folder of Professional Knowledge”

Since the lack of professional experience and reusable procedural knowledge is the key limitation of general-purpose Agents, Skills were created as a targeted solution. The design philosophy of Skills at Anthropic centers on “minimalism” — this simplicity is not accidental, but a deliberate choice.

1. Core Definition of Skills: A Collection of Files Packaging Composable Procedural Knowledge

In simple terms, Skills are “a collection of files that package composable procedural knowledge” — quite literally, a folder. This definition may seem straightforward, but it precisely captures the core demand for “reusability”: it brings together professional operational processes, tool scripts, reference documents, and more for a specific field or task into a single folder, forming a reusable “professional knowledge package”.
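To make the "folder" idea concrete, here is a minimal sketch of creating one on disk. The layout (a `SKILL.md` file with a short `name`/`description` header, plus `scripts/` and `reference/` subfolders) and the helper names `create_skill` and `list_files` are assumptions chosen for illustration, not a format taken from the session itself.

```python
from pathlib import Path


def create_skill(root: Path, name: str, description: str, instructions: str) -> Path:
    """Create a hypothetical Skill folder: instructions, scripts, reference docs."""
    skill_dir = root / name
    (skill_dir / "scripts").mkdir(parents=True, exist_ok=True)
    (skill_dir / "reference").mkdir(exist_ok=True)
    # Assumed convention: a short metadata header, then free-form instructions.
    header = f"---\nname: {name}\ndescription: {description}\n---\n\n"
    (skill_dir / "SKILL.md").write_text(header + instructions, encoding="utf-8")
    return skill_dir


def list_files(skill_dir: Path) -> list[str]:
    """Return every path inside the Skill folder, relative to it, sorted."""
    return sorted(p.relative_to(skill_dir).as_posix() for p in skill_dir.rglob("*"))
```

Because the result is only files and directories, nothing beyond a file system is needed to read, copy or edit it.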

2. The Rationale for Minimalist Design: Accessible Creation, Sharing and Management for Everyone

Anthropic designed Skills to be intentionally simple, with the core goal of lowering the barrier to use: 「anyone — whether a human or an AI Agent — can create and use Skills with just a computer」. This design delivers three key advantages:

「Version control」: Skills can be managed with Git to track every modification and iteration;
「Easy sharing」: Skills can be uploaded to Google Drive for sharing, or packaged as zip files and sent to team members;
「Low-barrier creation」: No complex development environment is required, making it accessible to ordinary users.
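The "package as a zip and send it" route can be sketched with the Python standard library alone. The function names `package_skill` and `install_skill` are hypothetical, and the sketch assumes a Skill is a plain directory.

```python
import shutil
import zipfile
from pathlib import Path


def package_skill(skill_dir: Path, out_dir: Path) -> Path:
    """Zip a Skill folder so it can be sent to a teammate or uploaded to shared storage."""
    archive = shutil.make_archive(
        base_name=str(out_dir / skill_dir.name),  # ".zip" is appended automatically
        format="zip",
        root_dir=str(skill_dir.parent),
        base_dir=skill_dir.name,                  # keep the folder name inside the archive
    )
    return Path(archive)


def install_skill(archive: Path, skills_root: Path) -> None:
    """Unpack a shared Skill archive into a local skills directory."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(skills_root)
```

Only archives from a trusted source should be unpacked this way, since a Skill may contain executable scripts.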

3. Scripted Tools: Solving the Three Critical Flaws of Traditional MCP Tools

Scripts can be stored in Skill folders as dedicated tools. This design directly addresses three core problems of traditional MCP (Model Context Protocol) tools. The table below compares traditional MCP tools with scripted tools in Skills:

| Feature | Traditional MCP Tools | Scripted Tools in Skills |
| --- | --- | --- |
| Performance with vague instructions | The model is prone to comprehension errors and execution failures | Code acts as its own documentation, with clear logic that reduces errors |
| Tool modifiability | The model cannot modify the tool itself | Scripts can be directly modified by the model to adapt to different scenarios |
| Context window usage | The tool permanently occupies the context window, consuming space | Loaded on demand from the file system, no permanent context occupation |

The fundamental issues with traditional MCP tools are their static nature, inflexibility and high resource consumption, while scripted tools achieve dynamism, customizability and low resource usage.

4. Real-World Example: Reusing a Python Script for Slide Styling

The Anthropic team shared a highly practical example: when handling slide creation tasks, Claude used to write the same Python script for applying slide styles again and again. Rewriting the script every time not only reduced efficiency but also led to inconsistent formatting.

The solution was simple: let Claude store this script in a Skill directory. For subsequent similar tasks, Claude no longer needs to rewrite the script and can call it directly. This small adjustment significantly improved the consistency and efficiency of task execution — a direct demonstration of the “reusability” value of Skills.
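The store-once, reuse-afterwards pattern can be sketched as follows. The helper `get_or_create_script` and the `scripts/` subfolder are illustrative assumptions; the `generate` callable stands in for whatever produces the script text the first time (in the example above, the model writing it).

```python
from pathlib import Path
from typing import Callable


def get_or_create_script(skill_dir: Path, filename: str, generate: Callable[[], str]) -> Path:
    """Return a stored script, generating and saving it only the first time."""
    script = skill_dir / "scripts" / filename
    if not script.exists():
        script.parent.mkdir(parents=True, exist_ok=True)
        script.write_text(generate(), encoding="utf-8")
    return script
```

On every later call the saved file is returned as-is, so the same styling logic is applied each time instead of a freshly rewritten variant.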

III. Progressive Disclosure: The Core Solution to Context Window Limitations

A mature AI Agent may need to draw on hundreds of different Skills, and it is clearly impractical to load all Skill content directly into the model’s context window: context windows have limited capacity, and a flood of irrelevant content only crowds out space for core information. To address this, Anthropic proposed 「Progressive Disclosure」.

1. The Core Pain Point of Multi-Skill Loading: Strained Context Window Resources

An AI model’s context window is like a computer’s memory — it has limited capacity, and priority must be given to storing the most critical information for the current task. If an Agent is configured with hundreds of Skills, each containing detailed instructions, scripts and reference documents, loading all of them will cause two major problems:

The context for the core task is crowded out, leading to scattered model attention;
A large volume of irrelevant information increases the model’s processing costs and reduces execution efficiency.

2. Core Logic of Progressive Disclosure: Metadata First, On-Demand Loading Second

The logic of Progressive Disclosure can be broken down into two simple steps:

「Load only Skill metadata at runtime」: The model first only retrieves basic information for each Skill, including its name and a brief description, to understand the core purpose of each Skill;
「Load full content on demand」: When the Agent determines a Skill is needed for a task, it then reads the complete instruction file (e.g., SKILL.md) for that Skill from the file system. Other reference materials and scripts are also accessed only when required.
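The two steps above can be sketched in a few lines. This is a simplified illustration, not Anthropic's implementation: it assumes each Skill lives in its own folder containing a `SKILL.md` whose first lines are a `---` delimited header with `name` and `description` fields, and the names `SkillMeta`, `build_index` and `load_skill` are hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SkillMeta:
    name: str
    description: str
    path: Path


def read_metadata(skill_md: Path) -> SkillMeta:
    """Read only the assumed 'key: value' header and stop before the body."""
    fields: dict[str, str] = {}
    with skill_md.open(encoding="utf-8") as fh:
        if fh.readline().strip() == "---":
            for line in fh:
                if line.strip() == "---":
                    break
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
    return SkillMeta(fields.get("name", skill_md.parent.name), fields.get("description", ""), skill_md)


def build_index(skills_root: Path) -> list[SkillMeta]:
    """Step 1: collect lightweight metadata for every Skill."""
    return [read_metadata(p) for p in sorted(skills_root.glob("*/SKILL.md"))]


def load_skill(meta: SkillMeta) -> str:
    """Step 2: read the full instructions only once the Skill has been chosen."""
    return meta.path.read_text(encoding="utf-8")
```

Only the output of `build_index` would be placed in the context up front; `load_skill` is called for the one or two Skills a task actually needs.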

3. Core Value of Progressive Disclosure: Unlimited Scalability of Skills

The key value of this design is that the number of Skills can keep growing without crowding out the task at hand, because the model never loads the complete content of all Skills at once and only needs to maintain a lightweight metadata list. The metadata list itself still grows with each Skill added, so “unlimited” holds in theory rather than literally, but in practice an Agent can draw on thousands of Skills across different fields without compromising the execution of core tasks due to context window limitations.
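A rough back-of-the-envelope comparison shows why the metadata-first approach scales. The token figures used below are made-up assumptions for illustration only, not measurements, and `context_cost` is a hypothetical helper.

```python
def context_cost(num_skills: int, meta_tokens: int, full_tokens: int, skills_in_use: int) -> dict[str, int]:
    """Compare tokens consumed by loading everything vs. metadata plus on-demand Skills."""
    eager = num_skills * full_tokens
    progressive = num_skills * meta_tokens + skills_in_use * full_tokens
    return {"eager": eager, "progressive": progressive}


# Assumed numbers: 1,000 Skills, 50 tokens of metadata each,
# 5,000 tokens per full Skill, 2 Skills actually needed for the task.
estimate = context_cost(num_skills=1000, meta_tokens=50, full_tokens=5000, skills_in_use=2)
# estimate == {"eager": 5000000, "progressive": 60000}
```

Under these assumptions the progressive cost is dominated by the metadata list, which grows linearly with the number of Skills but roughly a hundred times more slowly than eager loading.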

IV. 5 Weeks of Explosive Growth: Three Core Application Scenarios for the Skills Ecosystem

In just five weeks after its launch, the Skills ecosystem grew at a rate that exceeded Anthropic’s expectations. This early growth was concentrated in three core scenarios, covering general capability expansion, third-party product integration, and enterprise internal implementation.

1. Basic Capability Skills: Expanding the Core Fundamental Functions of General-Purpose Agents

The core goal of Basic Capability Skills is to supplement general-purpose Agents with entirely new universal capabilities, enabling them to handle more fundamental tasks:

「Anthropic’s in-house example」: Document processing Skills that equip Claude with the ability to create and edit professional-grade Office documents, eliminating the need to develop a dedicated Agent for this task;
「Third-party example」: Scientific research Skills developed by Cadence that allow Claude to perform EHR (Electronic Health Record) data analysis and bioinformatics-related tasks, expanding its basic processing capabilities in the scientific research field.

2. Product Integration Skills: Enabling Agent Compatibility with Third-Party Products

These Skills focus on making Agents better adapted to and able to use third-party products, unblocking the collaboration between Agents and external tools:

「Browserbase example」: The Stagehand Skill, which enables Claude to automate browser operations such as batch processing web data and simulating user interactions;
「Notion example」: Custom-built Skills that help Claude deeply understand the structure of a user’s Notion workspace, accurately extract and organize data in Notion, and adapt to Notion’s usage logic.

3. Enterprise Internal Skills: The Fastest-Growing Skill Implementation Scenario

Enterprise Internal Skills were the fastest-growing category in the first five weeks, with a core focus on institutionalizing a company’s internal knowledge and norms into reusable Skills:

「Organizational best practices」: Packaging enterprise best practices in finance, recruitment, legal affairs and other fields into Skills, enabling Agents to process related tasks in accordance with corporate norms;
「Internal software usage」: Documenting the usage methods of in-house developed software into Skills, allowing Agents to quickly master software operation logic and reduce employee training costs;
「Unified code standards」: For developer teams of ten thousand or more, Skills are used to unify code styles and engineering specifications, improving team collaboration efficiency.

4. A Key Trend: Non-Technical Professionals Are Also Building Custom Skills

The most noteworthy trend in the growth of the Skills ecosystem is this: 「non-technical professionals have begun to become Skill builders」. Practitioners in finance, recruitment, legal affairs, accounting and other roles, even without coding skills, can transform general-purpose Agents into custom tools that meet their work needs through Skills.

Behind this trend lies the value of Anthropic’s minimalist Skill design — no complex development capabilities are required; practitioners only need to organize their professional processes and knowledge to create reusable Skills, endowing general-purpose Agents with industry-specific professional capabilities.

V. Skills + MCP: Complementary, Not Competitive – Building a Complete AI Agent Capability Architecture

A common question arises: will the rise of the Skills ecosystem replace traditional MCP tools? Anthropic’s practice provides a clear answer: Skills and MCP are not competitive, but complementary, and their combination is essential to building a complete AI Agent capability architecture.

1. Core Value of MCP: Connecting Agents to the External World

The core role of MCP (Model Context Protocol) tools is to build a bridge for Agents to connect with the external world, including:

Integrating with external APIs: Calling third-party platform interfaces to retrieve data or execute operations;
Accessing databases: Reading structured data from internal or external enterprise sources;
Integrating third-party services: Such as payment, storage and collaboration tools.

In short, MCP solves the problem of “how an Agent can reach external resources” and acts as the “channel” for an Agent’s interaction with the external environment.

2. Core Value of Skills: Institutionalizing Professional Knowledge and Processes

Skills, on the other hand, focus on the problem of “how an Agent can use external resources” — they institutionalize professional knowledge, operational processes and reusable scripts in a specific field. When building Skills, developers often orchestrate multiple MCP tools to string them together into complex workflows that align with business logic.

For example, a “financial report auto-generation” Skill would integrate multiple MCP tools: call a financial system API to pull raw data (MCP) → use a Python script to clean and analyze the data (script in Skills) → call a document tool API to generate an Excel report (MCP) → use a formatting script to unify report styles (script in Skills).
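The four-step flow above can be sketched as a small pipeline. This is an illustration under stated assumptions: the two callables `fetch_raw` and `create_document` merely stand in for MCP tool calls (no real MCP client is used), and all function names and the data shapes are hypothetical.

```python
from typing import Callable, Optional

Row = dict[str, Optional[float]]


def clean_and_summarize(rows: list[Row]) -> dict[str, float]:
    """Skill script: drop rows with missing values, then total each column."""
    totals: dict[str, float] = {}
    for row in rows:
        if any(value is None for value in row.values()):
            continue
        for key, value in row.items():
            totals[key] = totals.get(key, 0.0) + value
    return totals


def apply_house_style(cells: list[tuple[str, float]]) -> list[str]:
    """Skill script: render every cell with the same label and number format."""
    return [f"{label.title()}: {value:,.2f}" for label, value in cells]


def generate_report(
    fetch_raw: Callable[[], list[Row]],
    create_document: Callable[[dict[str, float]], list[tuple[str, float]]],
) -> list[str]:
    """Run the four steps in order; the two callables stand in for MCP tool calls."""
    rows = fetch_raw()                    # MCP: pull raw data
    summary = clean_and_summarize(rows)   # Skill script: clean and analyze
    document = create_document(summary)   # MCP: build the document
    return apply_house_style(document)    # Skill script: unify styles
```

Passing the external calls in as parameters keeps the Skill's own logic testable without any live connection.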

3. A Converging Architectural Model: Four Core Components Form the Heart of Agent Capabilities

With the combined implementation of Skills and MCP, the Agent architecture is converging into a clear and structured model consisting of four core components:

「Agent loop」: Responsible for context management, task requirement analysis, and deciding which Skills and MCP tools to call;
「Runtime environment」: Provides file system and code execution capabilities to support Skill storage and script operation;
「MCP server」: Serves as the core of external connectivity, integrating with APIs, databases and third-party services;
「Skills library」: A collection of on-demand loadable professional capabilities that institutionalize domain knowledge and operational processes.
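As a rough mental model, the four components can be wired together like this. Everything here is a toy: the class and field names are hypothetical, the "runtime" is a dict standing in for a file system, MCP servers are plain callables, and the Skill choice uses naive word overlap where a real Agent loop would ask the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Agent:
    """Toy wiring of the four components; every name here is hypothetical."""
    # Runtime environment: a dict standing in for the file system.
    files: dict[str, str] = field(default_factory=dict)
    # MCP servers: name -> callable standing in for a remote tool call.
    mcp_servers: dict[str, Callable[[str], str]] = field(default_factory=dict)
    # Skills library: name -> short description (metadata only).
    skills: dict[str, str] = field(default_factory=dict)

    def choose_skill(self, task: str) -> Optional[str]:
        """Agent loop decision, faked with word overlap instead of a model call."""
        task_words = set(task.lower().split())
        best, best_score = None, 0
        for name, description in self.skills.items():
            score = len(task_words & set(description.lower().split()))
            if score > best_score:
                best, best_score = name, score
        return best

    def step(self, task: str) -> dict[str, object]:
        """One loop iteration: pick a Skill and report which tools are reachable."""
        return {"task": task, "skill": self.choose_skill(task), "tools": sorted(self.mcp_servers)}
```

The point of the sketch is the separation of concerns: swapping the contents of `mcp_servers` and `skills` changes what the Agent can do without touching the loop itself.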

4. Implementation Example: Anthropic’s Vertical Domain Solutions

The combination of Skills and MCP has made it possible for Agents to quickly expand into new fields. Immediately after launching Skills, Anthropic rolled out two vertical domain Agent solutions for financial services and life sciences — the core logic was to configure the general-purpose Agent with domain-specific MCP servers and exclusive Skills, without building a new Agent architecture from scratch.

This model validates the value of the Skills+MCP combination: by only adjusting Skills and MCP configurations, a general-purpose Agent can be endowed with vertical domain professional capabilities, drastically reducing the cost of scenario expansion for Agents.
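The "same Agent, different configuration" idea can be sketched as a shared base plus per-domain overlays. All identifiers below (model name, server names, Skill names) are invented placeholders and do not describe Anthropic's actual offerings.

```python
# Shared by every domain: the general-purpose Agent itself (placeholder values).
BASE_AGENT = {"model": "general-purpose-model", "runtime": "filesystem+code-execution"}

# Per-domain additions: only MCP servers and Skills differ (hypothetical names).
DOMAIN_OVERLAYS = {
    "financial-services": {
        "mcp_servers": ["market-data", "ledger"],
        "skills": ["earnings-summary", "compliance-check"],
    },
    "life-sciences": {
        "mcp_servers": ["literature-search", "lab-records"],
        "skills": ["protocol-review", "sequence-analysis"],
    },
}


def build_agent_config(domain: str) -> dict[str, object]:
    """Combine the shared base with one domain's MCP servers and Skills."""
    return {**BASE_AGENT, **DOMAIN_OVERLAYS[domain]}
```

Adding a third domain in this sketch means adding one more overlay entry, with no change to the base.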

VI. Future Directions: Manage Complex Skills Like Software

As Skills are applied more widely, their form has evolved from simple markdown files to complete packages containing executable files, scripts and assets. In response, Anthropic has proposed three core directions for Skill management in the future, with a central guiding principle: 「treat Skills like software」.

1. Testing and Evaluation: Ensuring the Accuracy of Skill Usage

The core value of Skills is “reusability”, and the prerequisite for reusability is “accuracy”. A robust Skill testing and evaluation system will be essential in the future:

「Scenario testing」: Verifying whether the Agent can correctly select and load the corresponding Skill in different task scenarios;
「Output evaluation」: Checking whether the quality of the Agent’s output meets expectations after using a Skill;
「Exception handling」: Testing Skill performance in abnormal situations such as script errors and resource shortages.
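Scenario testing in particular lends itself to a small harness. The sketch below assumes the Agent's Skill choice can be exposed as a function from task text to Skill name (or `None`); `evaluate_selection` and the case format are hypothetical, and output evaluation would need a separate, task-specific check.

```python
from typing import Callable, Optional

Case = tuple[str, Optional[str]]  # (task text, expected Skill name or None)


def evaluate_selection(select: Callable[[str], Optional[str]], cases: list[Case]) -> dict[str, object]:
    """Run each task through the selector and record where it picked the wrong Skill."""
    failures = []
    for task, expected in cases:
        try:
            actual = select(task)
        except Exception as exc:  # a crashing selector counts as a failed case
            actual = f"error: {exc}"
        if actual != expected:
            failures.append({"task": task, "expected": expected, "actual": actual})
    return {"total": len(cases), "passed": len(cases) - len(failures), "failures": failures}
```

Including cases where no Skill should be selected is as useful as the positive cases, since over-eager loading wastes context.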

2. Version Control: Tracking the Impact of Skill Evolution on Agent Behavior

Skills will continue to iterate as business needs change, making a comprehensive version control mechanism necessary:

「Version logging」: Recording the modification content, update time and responsible person for each Skill version;
「Behavior tracking」: Analyzing the impact of Skill version changes on Agent task execution to avoid behavioral anomalies caused by version iterations;
「Rollback mechanism」: The ability to quickly roll back to a stable version when a new Skill version encounters issues.
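The three mechanisms above can be illustrated with an append-only history. In practice Git already provides this for a Skill folder; the in-memory classes below (`SkillVersion`, `SkillHistory`) are hypothetical and only show the shape of the data being tracked.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SkillVersion:
    number: int
    author: str
    note: str
    content: str
    updated_at: datetime


@dataclass
class SkillHistory:
    versions: list[SkillVersion] = field(default_factory=list)

    def publish(self, author: str, note: str, content: str) -> SkillVersion:
        """Append a new version; earlier entries are never edited."""
        version = SkillVersion(len(self.versions) + 1, author, note, content, datetime.now(timezone.utc))
        self.versions.append(version)
        return version

    def current(self) -> SkillVersion:
        return self.versions[-1]

    def rollback(self, to_number: int, author: str) -> SkillVersion:
        """Restore an earlier version's content by publishing it again as the newest version."""
        if not 1 <= to_number <= len(self.versions):
            raise ValueError(f"unknown version: {to_number}")
        target = self.versions[to_number - 1]
        return self.publish(author, f"rollback to v{to_number}", target.content)
```

Recording a rollback as a new entry, rather than deleting the faulty version, keeps the log complete for later behavior analysis.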

3. Dependency Declaration: Improving the Predictability of Agent Behavior Across Environments

As cross-references between Skills become more frequent, and as Skills increase their dependencies on MCP servers and environment packages, 「dependency declaration」 has become crucial:

「Internal dependencies」: Declaring cross-reference relationships between Skills to ensure loading order and dependency integrity;
「External dependencies」: Declaring information such as MCP server addresses and environment package versions required by Skills;
「Environment adaptation」: Enabling Agents to automatically adapt to different runtime environments based on dependency declarations, ensuring behavioral consistency.
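Once dependencies are declared, checking them is mechanical. The sketch below assumes internal dependencies are given as a mapping from each Skill to the Skills it relies on, and external dependencies as exact version pins; both formats and the function names are illustrative assumptions. Ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def load_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return Skills ordered so each one comes after the Skills it depends on.

    Raises graphlib.CycleError if the declarations contain a circular reference.
    """
    return list(TopologicalSorter(dependencies).static_order())


def missing_requirements(required: dict[str, str], installed: dict[str, str]) -> list[str]:
    """List external requirements that are absent or installed at a different version."""
    return sorted(name for name, version in required.items() if installed.get(name) != version)
```

Running both checks before a task starts turns a mid-task failure into an upfront, explainable error.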

VII. Processor, Operating System, Application: The Definitive Analogy for AI Agent Architecture

Anthropic concluded the sharing session with a vivid analogy that clarifies the relationship between models, Agent Runtime and Skills — and also paints a clear picture of the ultimate direction of AI Agent architecture.

1. Model: Like a Computer’s Processor

An AI model (e.g., Claude) is like a computer’s processor — it requires huge R&D and computing power investment and holds enormous potential, yet it has almost no practical use on its own. The value of a processor can only be unlocked through an operating system and applications; similarly, the capabilities of a model can only be realized with the support of Agent Runtime and Skills.

2. Agent Runtime: Like a Computer’s Operating System

Agent Runtime (the Agent’s runtime environment) is like an operating system — it orchestrates the processes, resources and data around the model, sending the right information (tokens) to and from the model to effectively utilize the model’s computing power and reasoning capabilities. One of the core exploration directions in the industry today is building the most efficient Agent Runtime abstraction layer to maximize model value.

3. Skills: Like a Computer’s Applications

Skills are like computer applications — a small number of companies have the ability to develop processors (models) and operating systems (Agent Runtime), but millions of developers and practitioners around the world can participate in building “applications (Skills)”.

The value of software lies in encoding domain-specific professional knowledge and unique perspectives, and this is exactly the core value of Skills: they allow practitioners in every field to institutionalize their professional knowledge into reusable “applications”, endowing general-purpose Agents with custom professional capabilities.

4. Core Goal: Make Agents Grow Stronger with Continuous Use

Anthropic’s ultimate goal is this: a Claude that has collaborated with a user for 30 days will be far more capable than the same Claude on day one. This capability improvement does not rely on model version upgrades, but on the continuous accumulation of Skills — Skills created by Claude today can be efficiently reused by future versions of Claude, making “memory” concrete and transferable.

When Claude is able to independently create and iterate on Skills, the entire Agent system will enter a self-sustaining positive growth cycle: the more it is used, the more Skills it accumulates; the more Skills it accumulates, the stronger its professional capabilities become, forming a closed loop of “usage → accumulation → reuse → improvement”.

FAQ: Answering Core Questions About AI Agent Skills

What are AI Agent Skills?

AI Agent Skills are a concept proposed by Anthropic, essentially a collection of files that package composable procedural knowledge (colloquially, a folder). They are used to supplement general-purpose Agents with professional procedural knowledge and reusable script tools, endowing Agents with domain-specific professional capabilities.

What is the fundamental difference between Skills and traditional MCP tools?

Traditional MCP tools are the “channels” for Agents to interact with the external world, solving the problem of “how to connect to external resources” — and they suffer from issues such as error-prone execution with vague instructions, non-modifiability, and permanent context window occupation. Skills, by contrast, are “professional knowledge packages” that solve the problem of “how to use external resources”. Their built-in script tools are modifiable, loaded on demand, and the code itself acts as documentation to improve execution consistency.

Why can Progressive Disclosure solve the problem of limited context window capacity?

The core of Progressive Disclosure is “on-demand loading”: the model only retrieves Skill metadata (name + brief description) at runtime, which does not consume excessive context space. The complete content of a Skill is only loaded when the Agent determines it is needed. This allows the number of Skills to grow very large without overwhelming the context window.

Can non-technical professionals create Skills?

Yes, they can. The minimalist design of Skills is specifically intended to lower the creation barrier. Non-technical professionals (e.g., finance, legal and recruitment practitioners) do not need coding skills; they only need to organize their professional processes and knowledge into a collection of files to create Skills that meet their specific needs.

How do Skills and MCP work together to enhance Agent capabilities?

MCP is responsible for building the Agent’s connection to the external world (e.g., APIs, databases, third-party services), while Skills institutionalize the professional knowledge and processes for using these external resources. Developers can orchestrate multiple MCP tools within a Skill to form complex business workflows, enabling Agents to not only reach external resources but also use them professionally to complete tasks.

Why do Skills enable Agents to “grow stronger with use”?

Skills make an Agent’s “memory” concrete and transferable — Skills created by an Agent when processing a task (e.g., scripts, process documents) can be directly reused in the future, eliminating the need for re-derivation. As the Agent is used over time, it accumulates more and more Skills, and its professional capabilities continue to improve — a process that does not rely on model upgrades.

Conclusion

Anthropic’s internal sharing session delivers a core message that provides a more efficient path for the research, development and implementation of AI Agents: abandon the inefficient practice of “building scenario-specific Agents repeatedly” and shift to a “general-purpose Agent + Skills” model. The minimalist design of Skills, Progressive Disclosure, the complementary integration with MCP, and the rapid growth of the Skills ecosystem all validate the feasibility of this model.

In the future, with the improvement of the Skill management system (including testing and evaluation, version control, and dependency declaration) and the widespread participation of non-technical professionals, Skills will become the core bridge connecting general-purpose AI models with professional capabilities across all fields. This will transform AI Agents from their current state of being “highly intelligent but inexperienced” into truly custom tools that are “both smart and professional” — unlocking the full potential of AI in real-world business and daily scenarios.
