How Clawdbot Remembers Everything: A Deep Dive into Its Local, Persistent Memory System
Have you ever found yourself repeating your requirements to an AI assistant because it forgot your previous conversation? Or felt uneasy about your sensitive chats being stored on some distant, unknown cloud server? Clawdbot, a popular open-source project with over 32,600 stars on GitHub, is redefining personal AI assistants with its core tenets of local execution and a persistent memory system.
Unlike cloud-dependent counterparts like ChatGPT or Claude, Clawdbot runs directly on your computer and integrates seamlessly with the chat platforms you already use, such as Discord, WhatsApp, and Telegram. It can autonomously handle real-world tasks like managing emails, scheduling events, and checking into flights. But its true magic lies in a 24/7 context retention system that remembers conversations and builds upon past interactions indefinitely. Crucially, all memory is stored locally, giving you full ownership and control.
Understanding the Foundation: The Critical Difference Between Context and Memory
To grasp Clawdbot, you must first distinguish between two fundamental concepts: “Context” and “Memory.” This distinction is the bedrock of its design.
- Context is all the information the AI model "sees" for a single request. It includes the system instructions, conversation history, tool execution results, and any attachments.
  - Characteristics: Ephemeral (exists only for that request), Bounded (limited by the model's context window, e.g., 200K tokens), Expensive (every token counts toward API cost and speed).
- Memory is the data persistently stored on disk, forming Clawdbot's "long-term brain."
  - Characteristics: Persistent (survives restarts, days, months), Unbounded (can grow indefinitely), Cheap (local storage has no API cost), Searchable (indexed for semantic retrieval).
In simple terms, Context is the model’s working memory, while Memory is its long-term knowledge base. Clawdbot’s intelligence lies in knowing when to search its vast long-term memory for relevant information and inject it into the limited context window, enabling coherent and informed responses.
The Substance of Memory: A Two-Layer, Plain-Text Philosophy
Clawdbot’s memory system is built on a powerful, simple principle: “Memory is plain Markdown in the agent workspace.” All memory resides in the default ~/clawd/ directory, organized in a clear two-layer structure:
~/clawd/
├── MEMORY.md # Layer 2: Long-term, curated knowledge
└── memory/
├── 2026-01-26.md # Layer 1: Today's notes
├── 2026-01-25.md # Yesterday's notes
└── ...
Layer 1: Daily Logs (memory/YYYY-MM-DD.md)
These are append-only, date-organized running logs. The AI agent writes to these files throughout the day, noting things it wants to remember—conversation snippets, task progress, or user preferences. The format is human-readable:
# 2026-01-26
## 10:30 AM - API Discussion
Discussed REST vs. GraphQL with user. Decision: use REST for simplicity.
Key endpoints: /users, /auth, /projects.
## 2:15 PM - Deployment
Deployed v2.3.0 to production. No issues.
Layer 2: Long-term Memory (MEMORY.md)
This is the refined and structured core knowledge base. When the agent identifies significant events, decisions, user preferences, or learned lessons, it writes them here. It’s more organized:
# Long-term Memory
## User Preferences
- Prefers TypeScript over JavaScript
- Likes concise explanations
## Important Decisions
- 2026-01-20: Chose REST over GraphQL for API architecture
- 2026-01-26: Decided to use Tailwind CSS for styling
This plain-text design offers complete transparency. You can read and edit these memory files with any text editor, and even version-control them with Git.
The Engine of Memory: Indexing, Search, and Write Mechanisms
How is Memory Indexed?
When you or the AI save a memory file, an efficient indexing pipeline triggers in the background:
1. File Watching: The Chokidar library monitors MEMORY.md and all .md files in the memory/ directory for changes, with a 1.5-second debounce to batch rapid writes.
2. Intelligent Chunking: File content is split into chunks of approximately 400 tokens, with an 80-token overlap between chunks. The overlap ensures facts spanning a chunk boundary are captured in both chunks (see the sketch after this list).
3. Vectorization: Each chunk is converted into a 1536-dimensional vector via an embedding model (such as OpenAI's text-embedding-3-small).
4. Hybrid Storage: Vectors and text are stored in a lightweight SQLite database (~/.clawdbot/memory/<agentId>.sqlite). Key components include:
   - The sqlite-vec extension enables vector similarity search.
   - SQLite's built-in FTS5 engine powers full-text keyword search (BM25 algorithm).
   - A hash cache prevents re-embedding identical content.
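Here is a minimal TypeScript sketch of the overlap-chunking step. The whitespace "tokenizer" and the function name chunkWithOverlap are illustrative assumptions, not Clawdbot's actual code; the real indexer uses a proper tokenizer.

// Minimal sketch of overlap chunking: split text into ~400-token chunks
// with an 80-token overlap. A whitespace split stands in for a real
// tokenizer; all names here are illustrative.
function chunkWithOverlap(
  text: string,
  chunkSize = 400,
  overlap = 80,
): string[] {
  const tokens = text.split(/\s+/).filter(Boolean); // crude tokenization
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance 320 tokens per chunk
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= tokens.length) break; // final chunk reached
  }
  return chunks;
}

// Each chunk would then be embedded into a 1536-dimensional vector and
// upserted into SQLite, skipping chunks whose content hash is cached.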
How is Memory Searched?
When the AI needs to recall something, it doesn’t clumsily load all memory. Instead, it uses two specialized tools for precise retrieval:
- The memory_search tool: Performs hybrid search, running semantic vector search and keyword full-text search in parallel. Results are combined using a weighted score: finalScore = 0.7 * vectorScore + 0.3 * textScore (configurable). This ensures high relevance whether you query a concept ("that earlier database discussion") or a specific term ("POSTGRES_URL"). Snippets scoring below a default threshold of 0.35 are filtered out. (A scoring sketch follows this list.)
- The memory_get tool: After memory_search locates relevant files, this tool reads specific lines from a file, pulling detailed information into the current context.
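A small sketch of how the weighted merge might look, assuming both scores are normalized to the 0..1 range; the Snippet shape and mergeResults name are illustrative, not Clawdbot's actual API.

// Sketch of the hybrid score merge: combine per-snippet vector and
// BM25 scores, then drop anything under the relevance threshold.
interface Snippet {
  path: string;
  line: number;
  vectorScore: number; // cosine similarity from sqlite-vec, 0..1
  textScore: number;   // normalized BM25 rank from FTS5, 0..1
}

function mergeResults(
  snippets: Snippet[],
  vectorWeight = 0.7,
  textWeight = 0.3,
  threshold = 0.35,
): (Snippet & { finalScore: number })[] {
  return snippets
    .map((s) => ({
      ...s,
      finalScore: vectorWeight * s.vectorScore + textWeight * s.textScore,
    }))
    .filter((s) => s.finalScore >= threshold)     // prune weak matches
    .sort((a, b) => b.finalScore - a.finalScore); // best first
}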
How is Memory Written?
Clawdbot has no dedicated “memory_write” tool. The AI uses the same standard write and edit tools it uses for any file to modify memory files. The decision of what to write and where is intelligently driven by the system prompt. The AI analyzes the conversation to decide if information belongs in the daily log or should be refined into the long-term memory.
Managing Infinite Conversations: Compaction, Flushing, and Pruning
Even models with 200,000 or 1 million token context windows eventually run out of space. Clawdbot employs a sophisticated set of “memory management” strategies for long dialogues.
Compaction
When conversation history nears the context window limit, Clawdbot triggers automatic compaction. It uses the LLM to summarize older, less immediate turns (e.g., turns 1-140) into a concise paragraph, while keeping recent, critical dialogue intact (e.g., the last 10 turns). This summary is persisted to the session’s JSONL transcript file, allowing future sessions to start with this condensed history, preserving the essence and continuity within the limited window.
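The overall shape of compaction can be sketched as follows; summarize() stands in for an LLM call, the turn counts mirror the example above, and none of these names are Clawdbot's actual API.

// Rough shape of compaction: summarize everything except the most
// recent turns, then rebuild the history as [summary, recent turns].
type Turn = { role: "user" | "assistant"; content: string };

async function compact(
  history: Turn[],
  summarize: (turns: Turn[]) => Promise<string>, // LLM summarization call
  keepRecent = 10,
): Promise<Turn[]> {
  if (history.length <= keepRecent) return history; // nothing to compact
  const older = history.slice(0, -keepRecent);      // e.g., turns 1-140
  const recent = history.slice(-keepRecent);        // last 10 turns, verbatim
  const summary = await summarize(older);
  return [
    { role: "assistant", content: `[Summary of earlier conversation]\n${summary}` },
    ...recent,
  ];
}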
The Pre-Compaction Memory Flush
Compaction is lossy. To prevent vital information from being omitted in the summary, Clawdbot executes a silent memory flush just before compaction. The system inserts a special instruction, directing the AI agent to immediately write the most crucial decisions and facts from the current conversation into the memory/YYYY-MM-DD.md file. After saving, the agent replies with NO_REPLY, invisible to the user. This ensures core information is safely on disk before any potential loss from summarization.
// Example memory flush configuration (clawdbot.json)
"memoryFlush": {
"enabled": true,
"softThresholdTokens": 4000, // Executes when 4000 tokens away from the compaction trigger
"prompt": "Write lasting notes to memory/YYYY-MM-DD.md; reply NO_REPLY if nothing to store."
}
Pruning and Cache-TTL Pruning
Tool executions (like running commands or reading files) can produce huge outputs (e.g., 50,000-character logs). The pruning function intelligently cleans these old, bulky tool results from the context: either truncating them, keeping head and tail sections, or replacing them entirely with a placeholder like [Old tool result content cleared]. This saves significant context space. The original, full output remains unchanged on disk.
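A sketch of the head-and-tail truncation, with illustrative character budgets; the real pruner's thresholds and strategies are configurable, and these names are assumptions.

// Sketch of tool-result pruning: oversized tool outputs in the
// in-context history are truncated to head + tail. The file on disk
// is untouched.
function pruneToolResult(
  output: string,
  maxChars = 2000,
  headChars = 800,
  tailChars = 800,
): string {
  if (output.length <= maxChars) return output; // small enough, keep whole
  const head = output.slice(0, headChars);
  const tail = output.slice(-tailChars);
  const dropped = output.length - headChars - tailChars;
  return `${head}\n[... ${dropped} chars pruned ...]\n${tail}`;
}

// The most aggressive mode replaces the result entirely with a
// placeholder such as "[Old tool result content cleared]".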
Cache-TTL Pruning is a cost-optimization strategy. Providers like Anthropic cache prompt prefixes to reduce the cost of repeated calls, but the cache expires (e.g., after 5 minutes). If a session idles past this TTL, the next request must re-cache the entire history at full price. Clawdbot can be configured so that, on detecting cache expiry, it prunes old history and only a shorter prompt needs re-caching, thereby reducing cost.
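The decision itself can be as simple as a timestamp comparison; this sketch assumes a five-minute TTL and an illustrative function name.

// Sketch of the cache-TTL decision: if the session has idled past the
// provider's prompt-cache TTL, the whole prefix would be re-cached at
// full price anyway, so prune old history first.
function shouldPruneForCacheExpiry(
  lastRequestAt: number,      // ms epoch of the previous API call
  cacheTtlMs = 5 * 60 * 1000, // e.g., a ~5-minute prompt-cache TTL
): boolean {
  return Date.now() - lastRequestAt > cacheTtlMs;
}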
Advanced Features: Multi-Agent Isolation and Session Lifecycle
Multi-Agent Memory Isolation
Clawdbot supports running multiple, independent AI agents (e.g., a “personal” and a “work” agent). Each has completely isolated workspaces and memory indexes:
~/.clawdbot/memory/ # Index database directory
├── main.sqlite # Index for "personal" agent
└── work.sqlite # Index for "work" agent
~/clawd/ # "Personal" agent workspace (source files)
~/clawd-work/ # "Work" agent workspace (source files)
By default, agents cannot automatically access each other’s memories, ensuring perfect context separation and privacy.
Session Lifecycle and Memory Hooks
Sessions don’t last forever. Clawdbot allows configurable session reset rules (e.g., daily reset, manual reset). When starting a new session via the /new command, a Session Memory Hook can be triggered. This feature automatically extracts the last few messages from the ending session, generates a descriptive title via the LLM, and saves it as a standalone memory file (e.g., 2026-01-26-api-design.md). This makes the context of that temporary conversation discoverable by future memory_search operations.
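A sketch of what such a hook might do; the onSessionReset and generateTitle helpers are hypothetical, and only the resulting file naming (e.g., 2026-01-26-api-design.md) comes from the description above.

// Sketch of a session memory hook: on /new, take the tail of the old
// session, ask the LLM for a short slug-style title, and save a dated
// memory file that the indexer will pick up like any other.
import { writeFile } from "node:fs/promises";
import { join } from "node:path";

async function onSessionReset(
  workspace: string,    // agent workspace, e.g., the clawd directory
  lastMessages: string[],
  generateTitle: (text: string) => Promise<string>, // LLM call, e.g., "api-design"
): Promise<void> {
  const excerpt = lastMessages.slice(-6).join("\n");  // last few messages
  const title = await generateTitle(excerpt);
  const date = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  const path = join(workspace, "memory", `${date}-${title}.md`);
  await writeFile(path, `# ${title}\n\n${excerpt}\n`);
  // The file watcher indexes this file, making the old session
  // discoverable via memory_search.
}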
Conclusion: The Design Philosophy Behind Clawdbot’s Memory
Clawdbot’s persistent memory system works and earns trust because it adheres to several core principles:
- Transparency Over Black Boxes: Memory is plain Markdown. You can read, edit, and version-control it. There are no opaque databases or proprietary formats.
- Search Over Injection: Instead of blindly stuffing all memory into context, the agent performs precise hybrid search (semantic + keyword) for what's relevant. This keeps context focused and costs down.
- Persistence Over Session: Important information lives in files on disk, not just in volatile conversation history. Compaction cannot destroy what's already been saved.
- Hybrid Over Pure: Combining vector search for understanding meaning with BM25 keyword search for exact term matching ensures good results for both fuzzy queries and specific lookups.
Frequently Asked Questions (FAQ)
Q1: Is Clawdbot’s memory truly fully local? Does anything get uploaded to the cloud?
A: Yes, the memory files (Markdown) and index database (SQLite) are stored locally on your machine. The indexing process (vectorization) will involve an API call if a cloud-based embedding model (like OpenAI’s) is used, but the resulting vector data is stored only in your local SQLite file.
Q2: How can I manually add or edit memories?
A: You can directly open and edit the MEMORY.md or any memory/YYYY-MM-DD.md file within your ~/clawd/ directory using a text editor. Once saved, Clawdbot’s file watcher will detect the change and automatically re-index the file, making your edits searchable.
Q3: Does memory indexing consume a lot of disk space or memory?
A: The index database is typically very lightweight. Text is stored efficiently, and the vector data is modest: a 1536-dimensional float32 vector occupies about 6 KB, so even ten thousand indexed chunks amount to roughly 60 MB. The primary disk usage comes from the original Markdown memory files themselves, which are plain text and highly efficient. The system is designed to run smoothly on personal hardware.
Q4: If I have a very long conversation, will compaction cause information loss?
A: Compaction itself is lossy. However, Clawdbot mitigates this with the pre-compaction memory flush mechanism. This forces the AI to write key information to disk before summarization happens, ensuring the most critical content is persisted and won’t be lost in the summary.
Q5: Can memory be continuous across different chat platforms like Discord and Telegram?
A: Yes. As long as the conversation is handled by the same Clawdbot agent instance, memories from all integrated platforms are written to the same set of memory files, enabling a continuous, cross-platform context experience.
By returning control of memory to the user and implementing persistence in a transparent, efficient, and robust manner, Clawdbot offers a powerful open-source solution for users who value privacy, customization, and long-term, coherent AI interaction. It proves that a small, locally-run AI assistant can possess a “memory” rivaling that of its cloud-based counterparts.

