Synchronous Blocking vs. Asynchronous Orchestration: A Deep Dive into Hermes Delegate and OpenClaw Multi-Agent Architectures
When you need multiple AI agents to collaborate on complex tasks, should you choose a “command-and-control” management style or a “symphony orchestra” loose orchestration? This decision directly determines your system’s response speed, resource consumption, and scalability.
In the evolution of modern AI agent systems, efficient multi-agent collaboration has become a central proposition. Mainstream solutions on the market show polarization: one end is represented by Hermes with its synchronous blocking model, pursuing extreme isolation and token efficiency; the other end is represented by OpenClaw with its asynchronous orchestration system, emphasizing flexibility and dynamic intervention capabilities. This article deeply analyzes the essential differences between these two design philosophies in architecture, communication patterns, and applicable scenarios, helping you make correct architectural decisions in actual projects.
The Collision of Two Design Philosophies
Core Question: What are the fundamental differences between Hermes and OpenClaw, and why do they represent two extremes in multi-agent collaboration?
Hermes adopts a “general contractor-subcontractor” model. The parent agent acts like a project manager, breaking down tasks and distributing them to child agents, then blocking and waiting for all results to return. This design pursues extreme isolation: all intermediate reasoning processes of child agents never pollute the parent agent’s context window, ultimately returning only a refined summary.
OpenClaw’s subagent system operates more like a symphony orchestra conductor. After the parent agent defines the global topology, individual agents run asynchronously as fixed roles, collaborating through event push mechanisms. More critically, the parent agent can even send “steering messages” to adjust direction while child agents are running—a flexibility Hermes lacks.
These differences represent not just technical implementations but divergent design philosophies: Hermes believes “less context is better,” preventing information pollution through strict isolation; OpenClaw argues that “moderate visibility is necessary,” achieving dynamic coordination through controlled information flow.
Hermes Delegate: The Efficient Parallel Executor
Core Question: How does the Hermes delegate_task mechanism work technically, and what are the critical implementation details?
Hermes implements its functionality in a concentrated 978-line delegate_tool.py file—a line count that itself suggests a lightweight, single-responsibility-focused design. It supports two working modes:
Single-Task Mode: Provides a goal parameter to launch a single child agent for synchronous execution, suitable for simple task delegation.
Batch Mode: Accepts an array of up to 3 tasks, processing them in parallel through ThreadPoolExecutor. This is a hard-coded limit that cannot be changed through configuration.
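As a rough illustration of batch mode, the sketch below runs a capped batch through ThreadPoolExecutor and blocks until every result is back. It is not the code in delegate_tool.py: run_child and delegate_batch are hypothetical names, and only the cap of 3 and the use of ThreadPoolExecutor come from the description above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_CHILDREN = 3  # the hard-coded cap described in the article


def run_child(task: dict) -> str:
    """Hypothetical stand-in for building and running one child agent."""
    return f"summary for: {task['goal']}"


def delegate_batch(tasks: list) -> list:
    """Run a batch of tasks in parallel and block until all have finished."""
    if not tasks:
        raise ValueError("at least one task is required")
    if len(tasks) > MAX_CONCURRENT_CHILDREN:
        raise ValueError(f"batch mode accepts at most {MAX_CONCURRENT_CHILDREN} tasks")
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_CHILDREN) as pool:
        # map() preserves input order, so summaries line up with their tasks.
        return list(pool.map(run_child, tasks))
```

The caller only regains control once the slowest of the tasks has returned, which is the blocking behavior the rest of this article contrasts with OpenClaw.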
Subagent Construction and Isolation Strategy
Each delegation creates a fresh AIAgent instance. This “use-and-discard” philosophy ensures strict context isolation:
- Independent conversation history: Child agents do not inherit any intermediate reasoning from the parent agent
- Independent Terminal session: Each task_id corresponds to an independent working directory and state
- Specially constructed System Prompt: Structure is concise and explicit, starting with “You are a focused subagent working on a specific delegated task,” including task objectives, optional context information, and the working directory path parsed from the parent agent
Child agent configuration follows strict inheritance rules: models can inherit from the parent agent or use delegation.model configuration; toolsets are calculated through intersection (children cannot obtain tools the parent lacks); credentials can share the parent’s credential pool or be configured independently.
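The inheritance rules for models and toolsets can be sketched in a few lines. This is a minimal illustration under assumptions: the function names and the shape of the delegation config dictionary are hypothetical, and only the rules themselves (model override via delegation.model, toolsets by intersection) come from the text.

```python
from typing import Optional


def resolve_child_model(parent_model: str, delegation_cfg: dict) -> str:
    """Use delegation.model when configured, otherwise inherit the parent's model."""
    return delegation_cfg.get("model") or parent_model


def resolve_child_tools(parent_tools: set, requested: Optional[set]) -> set:
    """A child can never hold a tool the parent lacks: take the intersection."""
    if requested is None:
        return set(parent_tools)
    return parent_tools & requested
```

Because the result is an intersection, asking for a tool the parent does not hold silently drops it rather than granting it.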
Workspace Rules are emphasized in the System Prompt: “Never assume a repository lives at /workspace/……” This explicit path declaration prevents child agents from failing due to incorrect path assumptions. In actual development, this defensive programming significantly reduces path-related bugs.
Application Scenario: Imagine you need to analyze three unrelated datasets simultaneously. You want each analysis to proceed in isolation without intermediate pandas DataFrame outputs polluting your main conversation context. Hermes is designed exactly for this scenario—you delegate three parallel tasks, each receiving only the essential parameters, and receive three clean summaries back without the messy intermediate steps.
Security Sandbox: Tool Deprivation Mechanism
To prevent loss of control, Hermes implements strict tool deprivation for child agents. This “principle of least privilege” is crucial in security-sensitive scenarios:
- delegate_task itself is disabled: Prevents recursive delegation that could cause infinite loops and resource exhaustion
- clarify is removed: Prevents interaction with users, avoiding fragmented user experience
- memory tools are stripped: Protects shared memory from accidental modification or pollution by child agents
- execute_code is disabled: Avoids security risks from child agents writing complex scripts
- send_message is removed: Prevents potential abuse from cross-platform message sending
Author’s Reflection: This strict security strategy reminds me of the “bulkhead” design in microservice architecture. While it appears overly restrictive, in production environments, an unconstrained child agent could indeed exhaust server resources with a single improper execute_code call or pollute global state through memory tools. The Hermes designers have clearly experienced these pain points firsthand.
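The deprivation list can be pictured as a simple filter applied to whatever toolset the child would otherwise receive. The sketch below is an assumption-laden illustration: BLOCKED_CHILD_TOOLS and strip_blocked_tools are hypothetical names, and the real tool identifiers in Hermes may be grouped or named differently.

```python
# Tools the article says are withheld from child agents (identifier spelling assumed).
BLOCKED_CHILD_TOOLS = frozenset(
    {"delegate_task", "clarify", "memory", "execute_code", "send_message"}
)


def strip_blocked_tools(tools: list) -> list:
    """Return the toolset with every blocked tool removed, preserving order."""
    return [name for name in tools if name not in BLOCKED_CHILD_TOOLS]
```

Keeping the blocklist in one immutable constant makes the security boundary easy to audit in a single place.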
Lifecycle and Depth Control
Hermes implements strict depth limiting: MAX_DEPTH = 2. This means a parent agent (depth 0) can spawn child agents (depth 1), but child agents cannot spawn grandchild agents (depth 2 requests are rejected). Each child agent increments the depth marker _delegate_depth during construction, checking for limit violations before invocation.
This “single generation” limit appears constraining but represents deliberate complexity control. In actual projects, I’ve observed that deeply nested agent hierarchies often lead to debugging difficulties and exponential growth in fault localization complexity.
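The depth rule can be sketched as a counter that each child inherits and increments. Only MAX_DEPTH and the _delegate_depth marker come from the text; the Agent class, spawn_child method, and DepthLimitError below are hypothetical.

```python
MAX_DEPTH = 2  # value given in the article


class DepthLimitError(RuntimeError):
    """Raised when a delegation would exceed the allowed depth."""


class Agent:
    def __init__(self, delegate_depth: int = 0):
        self._delegate_depth = delegate_depth

    def spawn_child(self) -> "Agent":
        child_depth = self._delegate_depth + 1
        # Depth 1 is allowed; a request that would create depth 2 is rejected.
        if child_depth >= MAX_DEPTH:
            raise DepthLimitError(f"depth {child_depth} is not allowed (MAX_DEPTH={MAX_DEPTH})")
        return Agent(delegate_depth=child_depth)
```

With these values a parent at depth 0 can create a child, and that child's own attempt to delegate fails before any grandchild is constructed.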
Child agents default to 50 iterations (DEFAULT_MAX_ITERATIONS, configurable via config.yaml). The critical flaw is the absence of any time-based timeout: the future.result() calls are made without a timeout argument, so nothing bounds wall-clock time and a child agent keeps running until it exhausts its iteration budget. If each round calls a slow LLM, 50 iterations can run for hours.
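To make the distinction between an iteration budget and a time budget concrete, here is a minimal sketch of an iteration-bounded loop. The step callback, the function name, and the status strings are hypothetical; only the default of 50 and the constant name come from the text.

```python
from typing import Callable, Optional

DEFAULT_MAX_ITERATIONS = 50  # default named in the article


def run_child_loop(
    step: Callable[[int], Optional[str]],
    max_iterations: int = DEFAULT_MAX_ITERATIONS,
) -> dict:
    """Call step() until it returns a final answer or the budget is spent."""
    for i in range(1, max_iterations + 1):
        # Each step may take arbitrarily long: the loop counts rounds, not seconds.
        answer = step(i)
        if answer is not None:
            return {"status": "completed", "iterations": i, "result": answer}
    return {"status": "max_iterations_reached", "iterations": max_iterations, "result": None}
```

The loop always terminates, but its duration is the sum of however long each round takes, which is why an iteration cap alone does not protect against slow rounds.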
Token Efficiency and Communication Patterns
Hermes adopts a pure unidirectional Request-Response communication pattern. The parent agent constructs goal and context, calls delegate_task(), then blocks waiting. Child agents have zero communication between them, and all intermediate tool calls and reasoning chains never enter the parent’s context window.
The only “progress notification” is UI-level callbacks (printing tool calls above the CLI spinner, or batch relaying tool names in Gateway), but this is not semantic communication—merely progress display.
This design is extremely efficient in token consumption: Assuming a parent agent context of 10K tokens, delegating 3 subtasks each consuming 5K tokens, total consumption is merely 10K + 3×5K = 25K tokens. Child agent results are compressed into refined summaries, with the parent agent’s context experiencing near-zero bloat.
Operational Example: Consider a code review scenario. The parent agent delegates three files to three child agents for review. Each child agent performs 10-15 tool calls (reading files, searching documentation, analyzing patterns), but the parent agent only receives three final summaries: “File A has 2 security issues,” “File B is clean,” “File C needs refactoring.” The parent never sees the intermediate grep commands, file reads, or reasoning chains—saving thousands of tokens.
Cascade Mechanism for Exception Termination
Child agent credential resolution follows a three-level priority:
- First checks delegation.base_url configuration; if present, uses that endpoint and api_key
- Second checks delegation.provider configuration, resolving full credentials through the runtime provider system
- If neither is configured, child agent inherits all credentials from the parent agent
When child and parent agents use the same provider, they share the same credential pool (cooldown status synchronized); otherwise, they load that provider’s independent pool. Each child agent obtains a credential lease, automatically releasing it in the finally block.
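The three-level priority reads naturally as a chain of early returns. The sketch below assumes a flat dictionary for the delegation config and a caller-supplied resolve_provider function; both are hypothetical, as is the added "source" field, which exists only to show which branch was taken.

```python
from typing import Callable


def resolve_child_credentials(
    delegation_cfg: dict,
    parent_creds: dict,
    resolve_provider: Callable[[str], dict],
) -> dict:
    """Pick credentials for a child agent using the three-level priority."""
    # 1. An explicit endpoint wins.
    if delegation_cfg.get("base_url"):
        return {
            "base_url": delegation_cfg["base_url"],
            "api_key": delegation_cfg.get("api_key"),
            "source": "base_url",
        }
    # 2. Otherwise a configured provider is resolved through the provider system.
    if delegation_cfg.get("provider"):
        creds = dict(resolve_provider(delegation_cfg["provider"]))
        creds["source"] = "provider"
        return creds
    # 3. Otherwise the child inherits everything from the parent.
    inherited = dict(parent_creds)
    inherited["source"] = "parent"
    return inherited
```

Copying the dictionaries before tagging them keeps the parent's credential record untouched.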
Lessons Learned: This design has an obvious shortcoming: the lack of timeout mechanisms. In production environments, if a child agent enters an infinite loop or calls a slow external API, the parent agent waits indefinitely. I’ve experienced the awkwardness of a data analysis subtask getting stuck processing an abnormally large file, causing the entire workflow to stall for hours. This is Hermes’ most critical area for improvement.
OpenClaw Subagent: The Asynchronous Orchestration System
Core Question: What capabilities does OpenClaw offer that Hermes lacks, and how does its asynchronous architecture solve real pain points?
OpenClaw constructs a complete hierarchical agent architecture, defining three roles:
- main: Primary agent responsible for global coordination
- orchestrator: Can continue spawning child agents
- leaf: Terminal nodes that cannot spawn further, focused on execution
Default configuration supports up to 8 concurrent paths globally, with each agent managing up to 5 active child agents, and configurable nesting depth (unlike Hermes’ hard-coded limits).
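One way to picture how roles and limits interact is a single admission check run before every spawn. This is a sketch under assumptions: AgentNode and can_spawn are hypothetical names, and only the three roles and the default limits of 8 and 5 come from the text.

```python
from dataclasses import dataclass

MAX_CONCURRENT_GLOBAL = 8   # default global concurrency from the article
MAX_CHILDREN_PER_AGENT = 5  # default active children per agent from the article
SPAWNING_ROLES = frozenset({"main", "orchestrator"})


@dataclass
class AgentNode:
    role: str  # "main", "orchestrator", or "leaf"
    active_children: int = 0


def can_spawn(node: AgentNode, global_active: int) -> tuple:
    """Return (allowed, reason) for a spawn request from this node."""
    if node.role not in SPAWNING_ROLES:
        return (False, "leaf agents cannot spawn")
    if node.active_children >= MAX_CHILDREN_PER_AGENT:
        return (False, "per-agent child limit reached")
    if global_active >= MAX_CONCURRENT_GLOBAL:
        return (False, "global concurrency limit reached")
    return (True, "ok")
```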
Asynchronous Event-Driven Communication
OpenClaw’s core innovation is the asynchronous push mechanism. After spawning a child agent, the parent immediately returns to continue processing other tasks. When child agents complete, they push results back to the parent via runSubagentAnnounceFlow() as “user messages.”
Documentation explicitly states: “Do not call sessions_list, sessions_history, or any polling tools; wait for completion events to arrive automatically.” This push design avoids polling overhead. In event-driven architecture, parent agents can concurrently manage multiple child agents without blocking.
Unique Insight: This pattern reminds me of the difference between WebSockets and traditional polling in modern web development. Hermes is like a synchronous Ajax call: simple but blocking. OpenClaw is like a WebSocket: more complex, but capable of handling true concurrency. For scenarios requiring simultaneous monitoring of multiple long-running tasks, asynchronous mode is almost the only choice.
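The push pattern can be imitated with plain asyncio: children place their results in the parent's inbox when they finish, and the parent never polls. This sketch is an analogy, not OpenClaw code; child, parent, and the message shape are hypothetical, and the queue merely plays the role that runSubagentAnnounceFlow() is described as playing.

```python
import asyncio


async def child(name: str, delay: float, inbox: asyncio.Queue) -> None:
    """Do some work, then push a completion announcement to the parent."""
    await asyncio.sleep(delay)
    await inbox.put({"from": name, "result": f"{name} done"})


async def parent() -> list:
    inbox: asyncio.Queue = asyncio.Queue()
    # Spawning returns immediately; the parent is not blocked on any child.
    tasks = [
        asyncio.create_task(child(name, delay, inbox))
        for name, delay in [("slow", 0.05), ("fast", 0.01)]
    ]
    # Announcements arrive in completion order, not spawn order.
    announcements = [await inbox.get() for _ in tasks]
    await asyncio.gather(*tasks)
    return [msg["from"] for msg in announcements]
```

Running asyncio.run(parent()) yields the faster child's announcement first even though it was spawned second, which is the property a blocking batch call cannot offer.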
The Steer Mechanism: Runtime Dynamic Intervention
OpenClaw’s most powerful capability is the Steer mechanism. Parent agents can send steering messages to running child agents, triggering them to adjust direction. This capability has rate-limiting protection (max once per 2 seconds), with messages up to 4,000 characters.
This is precisely the capability Hermes lacks—when users send new messages in Gateway, Hermes can only violently interrupt all child agents, while OpenClaw can judge user intent and selectively steer or query progress.
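The two Steer limits mentioned above, one message per 2 seconds and at most 4,000 characters, can be sketched as a small gate. SteerGate and its method are hypothetical; whether OpenClaw rejects, queues, or truncates offending messages is not stated in the text, so rejecting is an assumption made here.

```python
import time

STEER_MIN_INTERVAL_S = 2.0  # rate limit from the article
STEER_MAX_CHARS = 4000      # message size limit from the article


class SteerGate:
    """Decides whether a steering message may be sent right now."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_sent = None

    def try_send(self, message: str) -> bool:
        if len(message) > STEER_MAX_CHARS:
            raise ValueError(f"steer message exceeds {STEER_MAX_CHARS} characters")
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < STEER_MIN_INTERVAL_S:
            return False  # too soon after the previous steer (assumed: reject)
        self._last_sent = now
        return True
```

Injecting the clock keeps the gate testable without real waiting.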
Application Scenario: Imagine a research agent retrieving materials when the user suddenly wants to modify search keywords. In Hermes, you must kill the entire research agent and restart; in OpenClaw, you send a steer message: “Please focus on papers after 2024,” and the child agent adjusts direction while continuing to run—the completed retrieval work isn’t wasted.
Operational Example: The user types “?” to check progress. In Hermes, this triggers an interrupt, killing all child agents and losing all intermediate work. In OpenClaw, the system can interpret this as a status query, return current progress (“Completed 60% of phase 1, currently analyzing section 3”), and allow the child agent to continue running. This elegant handling of user intervention is crucial for production systems.
Lifecycle Management
OpenClaw provides comprehensive lifecycle management:
- runTimeoutSeconds parameter (default 300 seconds): Provides time-based control, preventing infinite runs
- runs.json persistence: Saves execution records supporting orphan recovery
- resumeSessionId: Allows resuming existing sessions, avoiding repeated work
These mechanisms suit long-process, multi-stage complex tasks. In production environments requiring cross-session state maintenance, persistence capabilities are indispensable.
Author’s Reflection: The 300-second default timeout might seem short for complex research tasks, but it’s a sane default for production safety. I’ve learned that in production systems, “fail fast” is often better than “hang indefinitely.” The orphan recovery feature is particularly valuable—it handles cases where the parent crashes but child agents continue running, allowing the system to reclaim those resources when the parent restarts.
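The idea behind persistence plus orphan recovery can be sketched with a JSON file of run records that is re-read after a restart. The record schema below (a "status" field keyed by run ID) is invented for illustration; the actual layout of runs.json is not described in this article.

```python
import json
import tempfile
from pathlib import Path


def save_runs(path: Path, runs: dict) -> None:
    """Persist run records so they survive a parent crash."""
    path.write_text(json.dumps(runs, indent=2))


def load_runs(path: Path) -> dict:
    """Load run records, or return an empty registry if none were saved."""
    return json.loads(path.read_text()) if path.exists() else {}


def find_orphans(runs: dict) -> list:
    """Runs still marked as running after a restart have no live parent."""
    return sorted(run_id for run_id, rec in runs.items() if rec.get("status") == "running")


def recovery_demo() -> list:
    """Simulate a crash and restart using a temporary runs.json."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "runs.json"
        save_runs(path, {"run-a": {"status": "completed"}, "run-b": {"status": "running"}})
        # After the restart, the file is the only source of truth.
        return find_orphans(load_runs(path))
```

A restarted parent can then decide per orphan whether to resume it or reclaim its resources.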
Communication Cost Analysis
Flexibility comes at a cost. OpenClaw’s announce pushes inject child agent results into the parent agent’s context, and frequent steer messages generate additional token overhead.
Assuming the same 3-subtask scenario, each announce consumes approximately 1K tokens, growing the parent agent’s context from 10K to 13K, with total consumption around 28K tokens—12% more than Hermes. In high-frequency interaction scenarios, this overhead accumulates.
Reflection: This trade-off makes me consider “hidden costs.” While OpenClaw consumes more tokens per task, if you account for resource waste from Hermes’ lack of timeout mechanisms (a stuck task might occupy resources for hours), OpenClaw’s total cost of ownership might actually be lower. Architectural choices cannot look only at surface metrics.
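The 25K versus 28K comparison is simple enough to reproduce in code, which also makes it easy to plug in your own numbers. The function names are hypothetical, and the model is deliberately naive: it counts only parent context, child consumption, and one announce per child, ignoring steer messages.

```python
def hermes_total(parent_ctx: int, child_costs: list) -> int:
    """Parent context plus child consumption; summaries add near-zero bloat."""
    return parent_ctx + sum(child_costs)


def openclaw_total(parent_ctx: int, child_costs: list, announce_tokens: int) -> int:
    """Same as Hermes, plus one announce injected into the parent per child."""
    return hermes_total(parent_ctx, child_costs) + announce_tokens * len(child_costs)


def overhead_percent(baseline: int, other: int) -> float:
    """Relative extra cost of `other` over `baseline`, in percent."""
    return round((other - baseline) / baseline * 100, 1)
```

With a 10,000-token parent, three 5,000-token children, and 1,000-token announces, the totals are 25,000 and 28,000 tokens, a 12% difference, matching the figures in the text.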
The Deep Logic of Architectural Differences
Core Question: Why do these design differences exist? What are the advantages of each from information theory and security perspectives?
The fundamental divergence between the two lies in the essential difference between synchronous and asynchronous approaches:
Hermes’ synchronous blocking means the parent agent is completely stalled during child execution, unable to respond to new demands. This “single-threaded” thinking simplifies state management but sacrifices concurrency.
OpenClaw’s asynchronous push allows parent agents to simultaneously manage multiple child agents, dynamically adjusting strategies. This requires more complex state machines but provides true parallel processing capabilities.
From an information theory perspective, Hermes’ child agent is a lossy compressor: compressing large amounts of intermediate tool calls and reasoning chains into a summary string. This design is extremely efficient in “single-point decision + batch execution” scenarios: the parent agent does planning, child agents only execute and return, with zero context bloat.
OpenClaw is more like a distributed system, requiring maintenance of complex topology states and message routing. Its Steer capability solves Hermes’ biggest pain point: graceful handling when users query progress.
From a security perspective, Hermes’ strong isolation forms a natural sandbox: child agents are stripped of all “escape” tools, which suits security-sensitive scenarios. OpenClaw’s child agents retain more capabilities, which also means a larger attack surface and more potential side effects.
Practical Selection Guide
Core Question: How to make the correct choice when facing specific projects?
Scenarios for Choosing Hermes
If you need to process 3 unrelated things in parallel without intermediate data polluting the main conversation, Hermes is the best choice:
- Quick Integration: Natural parallelism, clean isolation, one tool call to get it done
- Token-Sensitive: Extreme token efficiency, suitable for scenarios sensitive to long-context model costs
- Simple Subtasks: Deterministic tasks not requiring mid-course direction adjustments
Specific Operations: Configure max_iterations in config.yaml to control execution rounds (default 50, adjust based on task complexity). Use delegation.model to specify child agent models (you can use lighter models than the parent to save costs). Ensure all subtasks are truly unrelated to avoid coordination scenarios.
Scenarios for Choosing OpenClaw
If your task requires adjusting direction while child agents are running, or needs timeout control to prevent infinite execution:
- Dynamic Intervention: Requires Steer mechanism to adjust running tasks
- External Agent Integration: Calling Claude Code, Codex, or other external tools as child agents (via ACP protocol)
- Long-Process Tasks: Requires persistence and recovery mechanisms (runs.json + orphan recovery)
- High Concurrency Needs: More than 3 agents working in parallel, requiring 8-way concurrency + 5 children per agent configuration
Specific Operations: Configure runTimeoutSeconds to prevent resource waste (recommended 300-600 seconds based on task type). Utilize resumeSessionId for fault tolerance, especially for long-process tasks. Design proper agent role topology (hierarchical relationships between main/orchestrator/leaf).
Decision Matrix
Hybrid Architecture: The Best of Both Worlds
Core Question: Can both patterns be used together in one system? How should it be designed?
The two are not mutually exclusive. Best practice is to choose appropriate patterns based on task characteristics: use OpenClaw’s asynchronous orchestration layer for complex workflows and multi-agent collaboration, embedding Hermes-style delegation calls when encountering subtasks requiring isolated parallelism.
Practical Case: A main agent spawns three agents via OpenClaw’s sessions_spawn:
- Research agent (calls Claude Code via ACP for deep retrieval)
- Analysis agent (local embedded processing)
- Writer agent (responsible for document generation)
The analysis agent internally uses Hermes’ delegate_task to process 3 unrelated data analysis tasks in parallel, leveraging its zero-context-bloat advantage. Meanwhile, research and writer agents collaborate through OpenClaw’s announce and steer mechanisms, maintaining flexibility.
Author’s Reflection: This “layered isolation” design reminds me of cache hierarchies in computer architecture. L1 cache (Hermes) is fast but small, suitable for local computation; L2 cache (OpenClaw) is large but high-latency, suitable for global coordination. Good architects know to use appropriate tools at different layers rather than trying to solve everything with one approach.
Operational Example Implementation:
# In your main orchestrator (OpenClaw mode)
agents:
  - role: orchestrator
    name: project_manager
    children:
      - role: leaf
        name: researcher
        type: external_acp  # OpenClaw external agent
      - role: leaf
        name: analyzer
        type: embedded
        # Inside this agent's tools:
        # Uses Hermes delegate_task for 3 parallel sub-analyses
      - role: leaf
        name: writer
        type: external_acp
Evolution Opportunities: Improvement Directions for Hermes
Core Question: Based on comparison with OpenClaw, what are the specific improvement directions for Hermes?
Based on pain points encountered in actual usage, Hermes has several high-priority improvement opportunities:
Highest Priority: Timeout Mechanism
Add timeout_seconds parameter to delegate_task to prevent child agents from running indefinitely. This is the most urgent need in production environments, referencing OpenClaw’s runTimeoutSeconds implementation.
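A minimal sketch of what such a parameter could look like, using the timeout argument that future.result() already accepts in the standard library. The delegate_with_timeout function and its return shape are hypothetical, not an existing Hermes API. Note the important caveat in the comments: a timed-out Python thread is not killed, so a real implementation would also need cooperative cancellation inside the child agent.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def slow_child() -> str:
    """Stand-in for a child agent stuck on slow work."""
    time.sleep(0.3)
    return "late summary"


def delegate_with_timeout(fn, timeout_seconds: float) -> dict:
    """Run fn in a worker thread and stop waiting after timeout_seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return {"status": "completed", "result": future.result(timeout=timeout_seconds)}
    except FutureTimeout:
        # cancel() cannot stop a call that is already running; the worker
        # thread continues in the background until fn returns on its own.
        future.cancel()
        return {"status": "timeout", "result": None}
    finally:
        # Do not block the parent while the worker finishes.
        pool.shutdown(wait=False)
```

Even this partial measure unblocks the parent, which addresses the user-facing stall; reclaiming the resources held by the stuck child is a separate problem.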
Improved Gateway Interrupt Logic
When users send short messages like “?” or “status,” return progress rather than killing all agents. This requires distinguishing between “query intent” and “new command intent.”
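Distinguishing the two intents can start as a very small preprocessing step in front of the agent layer. Everything in this sketch is hypothetical, including the keyword set and the action names; a production system would likely need a more robust classifier than exact matching.

```python
# Short messages treated as progress queries (assumed keyword set).
STATUS_QUERIES = frozenset({"?", "status", "progress"})


def classify_message(text: str) -> str:
    """Return 'status_query' for progress checks, otherwise 'new_command'."""
    return "status_query" if text.strip().lower() in STATUS_QUERIES else "new_command"


def handle_gateway_message(text: str, cached_progress: str) -> dict:
    """Answer progress queries from cache; only real commands interrupt children."""
    if classify_message(text) == "status_query":
        return {"action": "report_progress", "reply": cached_progress}
    return {"action": "interrupt_children", "reply": None}
```

Exact matching errs on the side of treating anything unrecognized as a new command, which preserves the current behavior for every message outside the keyword set.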
Medium Priority Improvements
- Introduce Steer Capability: Allow parent agents to send steering messages to running child agents, requiring redesign of the communication protocol
- Remove Concurrency Limits: Make MAX_CONCURRENT_CHILDREN configurable (currently hard-coded at 3)
- Support Persistence and Recovery: Save child agent state for resumption after interruption, referencing runs.json implementation
Long-term Evolution
Supporting external agents (launching Claude Code/Codex as child agents via ACP) would significantly expand Hermes’ capability boundaries, upgrading it from an “internal tool” to an “open ecosystem.”
Author’s Reflection: The timeout mechanism is non-negotiable for production readiness. I’ve seen too many systems where a single hanging subtask ruins the user experience. The lack of this feature suggests Hermes was designed primarily for controlled development environments rather than chaotic production realities. Adding this would be a significant maturity milestone.
Conclusion
Hermes and OpenClaw represent two extremes in multi-agent collaboration: one pursues extreme isolation and efficiency, the other maximum flexibility and control. Understanding their design trade-offs enables correct architectural choices in actual projects.
Future agent systems will likely be a fusion of these two philosophies—isolating strongly when isolation is needed, collaborating strongly when collaboration is needed, letting architecture serve task essence rather than the reverse.
Practical Summary and Action Checklist
Core Question: If you can only remember three points, what should they be? How can you start applying this knowledge tomorrow?
One-Page Overview
Hermes Delegate = Synchronous Blocking + Strict Isolation + Token Efficient
- Like a project manager subcontracting tasks, waiting for all results to return
- Child agent intermediate processes invisible, returns only summary
- Best for: Batch parallelism, simple subtasks, token-sensitive scenarios
- Limitations: No timeout, hard-coded 3-concurrency, cannot dynamically intervene
OpenClaw Subagent = Asynchronous Orchestration + Dynamic Intervention + Lifecycle Management
- Like a symphony conductor, asynchronously coordinating multiple roles
- Supports Steer mechanism to adjust running tasks
- Best for: Complex workflows, long processes, external agent integration
- Cost: 12% higher token overhead, complex architecture
Implementation Checklist
If you choose Hermes:
- [ ] Set reasonable max_iterations in config.yaml (default 50, adjust based on task complexity)
- [ ] Ensure all subtasks are truly unrelated, avoiding scenarios requiring coordination
- [ ] Monitor task execution time, prepare external timeout control mechanisms (e.g., Docker timeout)
- [ ] Use lightweight models as child agent models to save costs
If you choose OpenClaw:
- [ ] Configure runTimeoutSeconds to prevent resource waste (recommended 300-600 seconds based on task type)
- [ ] Design proper agent role topology (hierarchical relationships between main/orchestrator/leaf)
- [ ] Utilize resumeSessionId for fault tolerance, especially for long-process tasks
- [ ] Monitor token consumption, avoid frequent steer operations causing context bloat
If you choose Hybrid Architecture:
- [ ] Use OpenClaw orchestrator layer for global coordination
- [ ] Embed Hermes processing for parallel subtasks at leaf nodes
- [ ] Clearly distinguish between “tasks requiring coordination” and “tasks that can be isolated”
- [ ] Establish monitoring systems, tracking resource consumption of both patterns separately
Frequently Asked Questions (FAQ)
Q: Can Hermes’ 3-concurrency limit be bypassed by modifying source code?
A: Technically yes, by modifying the hard-coded value in delegate_tool.py, but not recommended. This limit is part of the design philosophy; forcibly bypassing it may cause unforeseen stability issues. If higher concurrency is needed, migrate to OpenClaw architecture.
Q: Does OpenClaw’s Steer mechanism interrupt the child agent’s current operation?
A: It does not interrupt immediately but adds the steering message to the child agent’s context, influencing its next-round decision. This “soft intervention” is more elegant than forced interruption but has delayed effect (depends on child agent iteration frequency).
Q: In Hermes, what happens if a child agent cannot complete the task within 50 iterations?
A: The task ends normally and returns current results (usually partial completion status). It is recommended to ensure 50 iterations are sufficient during task design, or implement checkpoint mechanisms at the application layer, breaking large tasks into multiple serializable subtasks.
Q: Is OpenClaw’s 12% token overhead increase a fixed value?
A: This is an estimated value based on 3 subtasks. Actual overhead depends on announce frequency and steer message volume. In high-frequency interaction scenarios, overhead may increase significantly; actual testing is required.
Q: Can both modes be mixed in the same agent instance?
A: Technically feasible but requires careful design. It is recommended to use OpenClaw as the outer layer orchestration (handling session management), using Hermes-style processing for specific leaf agent internal batch subtasks, avoiding circular dependencies.
Q: How is the working directory determined for Hermes child agents?
A: The working directory path parsed from the parent agent is explicitly written into the child agent’s System Prompt, following the principle of “never assume paths.” Ensure the parent agent provides explicit working_dir parameters when calling.
Q: In Gateway products, when users send “?” to check progress, how to avoid accidentally killing all child agents?
A: The current Hermes implementation interrupts all child agents; this is a known limitation. A short-term workaround is to preprocess user input at the application layer: identify query intent and return cached progress information directly, without passing the message to the agent layer.
Q: For long-duration tasks requiring external API calls (like video generation), which mode is more suitable?
A: OpenClaw is more suitable. Its timeout mechanism and asynchronous push better handle uncontrollable external API delays, and it supports orphan recovery to cope with instability in external services.

