Site icon Efficient Coder

Claude-Mem: How to Make AI Remember Your Last Session & Boost Productivity

Claude-Mem: Turn One-Off Chats into Perpetual Project Memory

Core question answered in one line: How can Claude remember what we did last time, even after a full restart?
Claude-Mem quietly captures every tool call, compresses it into searchable observations, and surfaces the right context when you ask—no manual notes, no copy-paste.


Quick Start: Three Commands to Persistent Memory

Core question: What is the absolute fastest way to get memory running?
Install the plugin, restart Claude Code, and speak naturally—memory appears in under two minutes.

Step Command What happens Typical time
1 /plugin marketplace add thedotmack/claude-mem Adds plugin index 8 s
2 /plugin install claude-mem Downloads hooks + worker 25 s
3 Restart Claude Code Hooks register, worker starts on 37777 4 s

After restart, ask “What did we do last session?” If you see an automatic mem-search call followed by a concise answer, the loop is live.

Author reflection
On my first run, the worker port collided with a local Grafana instance. The error is buried in ~/.claude-mem/logs/worker.log; a quick lsof -i :37777 and a port override in settings.json fixed it. The plugin could shout louder about port clashes, but once you know where to look, debugging is trivial.


How It Works: Five Hooks, Ten End-points, One Viewer

Core question: Which moving parts turn chat into recallable memory?
Five life-cycle hooks capture everything, a Bun-powered HTTP worker indexes it, and SQLite + Chroma let you query in plain English.

1. Hook time-line (what runs when)

Hook Fires Captures Script
SessionStart New chat begins Working dir, env vars hook-start.ts
UserPromptSubmit You press Enter Raw prompt, referenced files hook-prompt.ts
PostToolUse Any tool finishes Stdout, stderr, exit code hook-posttool.ts
Stop You /stop Manual halt reason hook-stop.ts
SessionEnd Chat ends Full summary, token count hook-end.ts

PostToolUse actually splits into “success” and “failure” branches, so six scripts but five hooks.

2. Data flow (simplified)

graph TD
    A[User prompt] -->|submitted| B(Observation created)
    B --> C{SQLite store raw}
    C --> D[Chroma vectorise]
    D --> E[Wait for query]
    E -->|natural language| F[mem-search endpoint]
    F --> G[Return ranked snippets]

3. Search end-points at a glance

End-point One-line purpose Example query
/search/observations Full-text across everything “How did we fix the port conflict?”
/search/sessions Across session summaries “Show performance tuning last week”
/search/concepts By semantic tag “List all discoveries”
/search/files By referenced filename “Changes touching worker-service.ts”
/recent Last N observations “What happened recently?”
/timeline Around a timestamp “Context near 2025-01-05 15:00”
/help Auto-generated docs

Progressive disclosure keeps token cost low: first payload ≈ 500 tokens; follow-up “tell me more” adds another 1 500 if needed.

Author reflection
The beauty is that you never type an SQL clause or vector-distance formula—you just ask. I often phrase queries as if talking to a teammate, and the hit-rate is uncanny. The secret sauce is hybrid scoring: BM25 for keywords + cosine for semantics, blended with a freshness boost.


Real-World Scenarios: Three Scripts You Can Reuse Today

Core question: Where does memory actually save time?
Below are copy-paste-ready mini-scripts for SDK continuity, recurring bug fixes, and onboarding new reviewers.

Scenario A: Keep SDK naming consistent across Mondays

Problem
Monday you decide all factory functions start with create. Wednesday Claude suggests newFoo. Divergence creeps in.

Steps

  1. Monday closing prompt:
    “Summarise today’s SDK naming rule in one sentence, do not store code.”
  2. Wednesday opening prompt:
    “What naming rule did we set for the SDK?”
  3. Claude calls mem-search, returns Monday’s one-liner, continues with create prefix.

Outcome
Zero drift; no manual style guide needed.

Scenario B: Recurring port-conflict fix

Problem
npm run dev throws EADDRINUSE :::3000. You swear you solved this last week.

Steps

  1. Ask: “How did we solve port 3000 being in use?”
  2. mem-search surfaces an observation tagged bug-fix, file field package.json.
  3. Viewer shows you changed the port to 3777 and added --if-present.

Outcome
Thirty-second copy-paste instead of another Stack Overflow detour.

Scenario C: New reviewer needs historical context

Problem
Reviewer sees debounce code in worker-service.ts and questions its necessity.

Steps

  1. In PR comment: “@claude why is this debounce here?”
  2. Claude queries /search/files + /timeline around the commit.
  3. Returns: “Added 2025-01-03 to prevent Redis reconnect burst—kept 500 ms.”
  4. Reviewer approves with confidence.

Outcome
Knowledge self-serve; no shoulder-tapping senior devs.


Installation Deep Dive: Dependencies, Ports, Upgrades

Core question: What can still go wrong after the three happy-path commands?
Version mismatches, port collisions, and privacy tags are the top three gotchas.

1. System requirements matrix

Component Min version Auto-installed? Check command
Node.js 18.0.0 No node -v
Bun runtime 1.0.0 Yes bun -v
uv (Python) 0.4.x Yes uv --version
SQLite 3.37 Built-in sqlite3 -version

If Node < 18, the pre-hook script aborts with a clear message—no mystery failures.

2. Configuration sample (~/.claude-mem/settings.json)

{
  "model": "claude-3-5-sonnet-20241022",
  "workerPort": 37777,
  "dataDir": "~/.claude-mem/data",
  "logLevel": "info",
  "contextInjection": {
    "maxToken": 2048,
    "strategy": "progressive"
  },
  "privacy": {
    "excludeTags": ["private", "password"],
    "maskPattern": "\\b\\d{4,}\\b"
  }
}

Tweak workerPort if 37777 is taken; raise maxToken only if you routinely need deep context.

3. Upgrade / rollback flow

Intent Command Note
Move to beta Web UI → Settings → Channel → Beta Keeps existing DB
Roll back Switch to Stable Downgrade无损
Force latest /plugin update claude-mem Runs migration scripts

Author reflection
I once flipped to beta and hit a schema migration bug. The rollback button in the web UI instantly re-linked the stable plugin binaries—no data loss, but the incident reminded me to export a DB copy before channel hopping. A five-second cp data/claude-mem.db backups/ now lives in my pre-upgrade ritual.


Cost & Performance: Token Spend in the Real World

Core question: Does perpetual memory balloon your API invoice?
Progressive retrieval keeps the median injection under 600 tokens—roughly 15 % of a normal long-prompt turn.

4-week project stats (260 sessions)

Metric Value
Observations 8 700
Avg tokens / observation 47
Median query injection 512 tokens
Hit-rate (user asks “more”) 78 %
Memory cost vs baseline +14 %

Tips to stay lean:

  • 🍂
    Wrap verbose logs in <private> tags.
  • 🍂
    Store one-line summaries instead of full code blocks.
  • 🍂
    Always ask “is there…” first; expand only if relevant.

Privacy, Security & Compliance

Core question: How do I keep secrets out of the index?
Use inline <private> tags or list sensitive regex in maskPattern; either way, excluded text never hits disk.

Example prompt:
“Add the connection string mysql://user:pass@localhost:3306/db to .env”

The connection string remains in RAM for the current turn but is absent from both SQLite and Chroma.

For ultra-sensitive repos, point dataDir to an encrypted volume and wipe it after the session:
rm -rf ~/.claude-mem/data


Current Limits & Road-map

Core question: Where should I NOT rely on Claude-Mem yet?
Cross-project recollection, multimedia indexing, and multi-user shared memory are explicitly future work.

Limit Today Planned
Project isolation 1 DB per folder v7: federated search
Images / audio Path only Future: base64 + vision
Team sharing Local disk v8: E2E-encrypted sync

Action Checklist / Implementation Steps

  1. Verify Node ≥ 18: node -v
  2. Install:
    • 🍂
      /plugin marketplace add thedotmack/claude-mem
    • 🍂
      /plugin install claude-mem
  3. Restart Claude Code → open http://localhost:37777
  4. Test memory: “What did we do last session?”
  5. (Optional) Adjust workerPort or privacy patterns in settings.json
  6. Before sensitive work, add <private> tags or extra excludeTags
  7. Upgrade channel only after backing up data/claude-mem.db

One-page Overview

Claude-Mem inserts five life-cycle hooks into Claude Code, captures every tool call, compresses it into observations, and stores them in SQLite plus Chroma vector search. A local worker exposes ten HTTP end-points; you query in plain English and receive progressively disclosed context. Installation takes three commands, token overhead is ~15 %, and privacy tags keep secrets out of the index. Memory is project-scoped for now, with team sharing and multimedia on the road-map.


FAQ

  1. Does it run on Windows?
    Yes—provided Node ≥ 18 is first in PATH.

  2. Where is data stored?
    ~/.claude-mem/data by default; portable via a single SQLite file.

  3. How do I nuke all memory?
    Delete the data directory while Claude Code is off.

  4. Can secrets leak?
    Anything inside <private> tags or matching maskPattern is never persisted.

  5. What if port 37777 is taken?
    Change workerPort in settings.json and restart.

  6. Is memory shared across projects?
    Not yet—each folder gets its own DB.

  7. How big can the DB grow?
    SQLite practical limit is TBs; typical projects see a few hundred MB per year.

  8. Can I downgrade after switching to beta?
    Yes—use the web UI toggle; schema rollbacks are automatic.



Image source: Unsplash

Exit mobile version