TL;DR

Claude-Mem is a Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with Claude’s Agent SDK, and injects relevant context back into future sessions. It’s the #3 trending AI repo on GitHub this week with 10,779 new stars (and ~45K+ cumulative), and it plugs directly into Claude Code, Gemini CLI, and OpenCode. Key facts:

  • Plugin for Claude Code, Gemini CLI, and OpenCode — one-command install via npx claude-mem install
  • 5 lifecycle hooks capture SessionStart, UserPromptSubmit, PostToolUse, Stop, and SessionEnd
  • Local HTTP worker on port 37777 with a web viewer UI and 10 search endpoints
  • SQLite + Chroma hybrid search — keyword (FTS5) plus semantic vectors
  • 3 MCP search tools (search, timeline, get_observations) for ~10× token savings over naive context loading
  • AGPL-3.0 license — free to use, modifications must stay open
  • Version 6.5.0, Node ≥18, auto-installs Bun and uv if missing
  • Beta “Endless Mode” — biomimetic memory architecture for very long sessions
  • Real concern: GitHub Issue #618 documents users burning their 5-hour token budget in <10 messages after enabling it

If you run Claude Code against the same project for weeks, Claude-Mem fixes the amnesia. If you’re on a Pro plan with tight session budgets, read the limitations section before installing.

What Problem Does Claude-Mem Actually Solve?

Every Claude Code user eventually hits the same wall. You spend 90 minutes teaching the agent your codebase — the weird auth flow, the reason utils/legacy.ts exists, the fact that you renamed the User table to Account three weeks ago. Then the session ends, context-compacts, or you /clear, and you start from zero.

The community has tried a few answers:

  1. Manual CLAUDE.md files — works for stable conventions, useless for “what did we actually do yesterday”
  2. MCP memory servers (mcp-server-memory, mem0) — require explicit calls, Claude usually forgets to use them
  3. Compact-at-end hooks — brittle, one-shot, lose fidelity
  4. Just re-read files — the default, expensive in tokens and slow

Claude-Mem takes a different approach: it hooks every tool call and user prompt, extracts observations in the background, and re-injects a prioritized slice into every new session — automatically, without Claude having to “remember to remember.”

The author, @thedotmack, describes it as “a 1-line-install memory system for Claude Code that prevents context loss between sessions.” The Reddit post that kicked off its rise put it in plainer terms: “Found an open-source tool (Claude-Mem) that gives Claude Persistent Memory via SQLite and reduces token usage by 95%” on long-running tasks (r/ClaudeAI, 2025-12-15).

Install in 60 Seconds

The quick start is genuinely one line:

npx claude-mem install

That’s it. The installer:

  1. Checks Node ≥18, then installs Bun and uv (the Python package manager needed for Chroma) if they’re missing
  2. Registers Claude Code plugin hooks in ~/.claude/plugins/
  3. Creates a data directory at ~/.claude-mem/ with defaults in settings.json
  4. Starts the worker service on port 37777

For Gemini CLI or OpenCode, pass the IDE flag:

npx claude-mem install --ide gemini-cli
npx claude-mem install --ide opencode

If you prefer Claude Code’s native plugin marketplace:

/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

Then restart Claude Code. You’ll see an extra block at the top of your next session: previous observations relevant to what you’re working on.

Warning: npm install -g claude-mem installs the SDK only. It does not register the plugin hooks or start the worker. Always use npx claude-mem install.

Verifying the install

Open http://localhost:37777 in a browser. You should see the web viewer — a real-time stream of captured observations with filters by type, session, and project. This is also where you switch between the stable and beta release channels.

How It Works (Under the Hood)

Claude-Mem is not one big agent. It’s a tight pipeline of small components:

1. Five lifecycle hooks (plus one pre-hook for smart install):

| Hook | Fires when | What it does |
|---|---|---|
| SessionStart | New Claude Code session | Queries SQLite + Chroma for relevant context, injects it |
| UserPromptSubmit | User sends a message | Tags the observation with user intent |
| PostToolUse | After any tool call | Captures file reads, edits, bash output |
| Stop | Claude finishes turn | Flushes buffer to worker |
| SessionEnd | Session terminates | Runs background summarization via Agent SDK |
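The capture side of this lifecycle can be pictured as a small buffer: the mid-session hooks append observations as they fire, and Stop drains everything to the worker in one batch. This is a minimal sketch of that flow; every name in it (ObservationBuffer, capture, flush) is hypothetical, not Claude-Mem's actual internals.

```typescript
// Illustrative sketch of the hook-to-worker handoff described in the table
// above. Names here are hypothetical, not Claude-Mem's real API.
type Observation = { hook: string; detail: string };

class ObservationBuffer {
  private pending: Observation[] = [];

  // UserPromptSubmit / PostToolUse hooks append observations as they fire
  capture(hook: string, detail: string): void {
    this.pending.push({ hook, detail });
  }

  // The Stop hook drains everything to the worker in one batch
  flush(send: (batch: Observation[]) => void): number {
    const batch = this.pending;
    this.pending = [];
    send(batch);
    return batch.length;
  }
}

const buf = new ObservationBuffer();
buf.capture("UserPromptSubmit", "finish the middleware migration");
buf.capture("PostToolUse", "read src/auth/session.ts");
const sent = buf.flush((batch) => void batch); // worker handoff would go here
```

The point of the batch-on-Stop design is that summarization work happens once per turn, not once per tool call, which is also why the PostToolUse capture settings matter for token cost (see the limitations section).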

2. Worker service — a Bun-managed HTTP API on port 37777 that mediates between hooks and storage. It exposes 10 endpoints including /api/observation/{id} for citation lookups and /api/stream for live observation feeds to Telegram, Discord, or Slack.

3. SQLite database with FTS5 for keyword search — stores sessions, observations, summaries.

4. Chroma vector database for semantic search. The hybrid ranking combines FTS5 hits with vector similarity, then de-duplicates.
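The hybrid step is conceptually a small merge. Here is an illustrative sketch, not Claude-Mem's actual scoring code: it assumes both result lists carry scores pre-normalized to 0..1, and de-duplicates by keeping the higher score when FTS5 and the vector index both find the same observation.

```typescript
// Illustrative hybrid-ranking sketch (not Claude-Mem's real implementation).
// Assumes scores are pre-normalized to 0..1; duplicates keep the higher score.
type Hit = { id: number; score: number };

function hybridRank(keywordHits: Hit[], vectorHits: Hit[], limit = 10): Hit[] {
  const best = new Map<number, number>();
  for (const h of [...keywordHits, ...vectorHits]) {
    best.set(h.id, Math.max(best.get(h.id) ?? 0, h.score));
  }
  return [...best.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score) // highest combined score first
    .slice(0, limit);
}

const ranked = hybridRank(
  [{ id: 412, score: 0.9 }, { id: 408, score: 0.4 }], // FTS5 keyword matches
  [{ id: 412, score: 0.7 }, { id: 405, score: 0.6 }], // vector similarity matches
);
```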

5. mem-search skill — a Claude Code skill using progressive disclosure. You (or Claude) query in natural language, get back a compact index first, then drill into full observations only for IDs that look promising.

The progressive-disclosure philosophy is why Claude-Mem’s authors claim ~10× token savings versus naive “load all history” memory systems.

The 3-Layer MCP Search Pattern

Claude-Mem exposes three MCP tools, and the order matters:

// Step 1: Get a compact index (~50–100 tokens per result)
search({
  query: "authentication bug",
  type: "bugfix",
  limit: 10
})

// Step 2: Optionally zoom in on chronological context
timeline({
  around_id: 456,
  window: 20
})

// Step 3: Only now fetch full details (~500–1,000 tokens per result)
get_observations({
  ids: [123, 456]
})

This is the same narrowing pattern as ripgrep --files-with-matches → ripgrep → cat, and for the same reason: you pay the big cost only once you’ve narrowed down.

A typical Claude-Mem session logs 40–200 observations. Naive retrieval of all of them could cost 100K+ tokens. Indexed search + selective fetch usually fits in <5K tokens while returning the same decisions, gotchas, and file references.
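Here is that arithmetic as a back-of-envelope sketch. The per-item token costs (~75 per index row, ~700 per full observation) are this article's rough figures, not measured values.

```typescript
// Back-of-envelope version of the token math above; all costs are the
// article's rough figures, not measurements.
function naiveCost(observations: number, tokensPerFull = 700): number {
  return observations * tokensPerFull; // every observation loaded in full
}

function indexedCost(
  indexRows: number,   // compact rows returned by search()
  fullFetches: number, // observations drilled into via get_observations()
  tokensPerIndexRow = 75,
  tokensPerFull = 700,
): number {
  return indexRows * tokensPerIndexRow + fullFetches * tokensPerFull;
}

const naive = naiveCost(150);       // a 150-observation session, loaded whole
const indexed = indexedCost(10, 4); // top-10 index, then 4 full fetches
```

With these assumptions, naive loading costs 105,000 tokens while the indexed path costs 3,550, consistent with the "100K+ versus <5K" claim above.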

A Real Example: Resuming a Refactor

Here’s an abbreviated, realistic flow. Yesterday you refactored src/auth/session.ts and decided to switch from JWT to opaque tokens. Today you open Claude Code in the same repo:

[claude-mem] Injected 6 observations from 2 prior sessions:

#412 [decision] 2026-04-19 — Switched session storage from JWT to opaque tokens
  File: src/auth/session.ts
  Reason: Needed immediate revocation; JWT statelessness was a liability
  Related: #408 (token TTL discussion), #410 (db migration)

#410 [migration] Added `sessions` table with (id, user_id, expires_at, revoked_at)
#408 [gotcha] Opaque tokens must be hashed before storage — used argon2id
#405 [bugfix] Fixed race condition in cleanup worker (see tests/auth/cleanup.test.ts)
...

Now you ask: “Finish the middleware migration we started.” Claude has the context without you narrating it, and hasn’t re-read 30 files to rediscover it.

Want to go deeper? Use the skill:

/skill mem-search "everything related to rate limiting decisions last month"

Or from Claude Desktop (supported separately), search across projects:

mem-search "when did we decide to drop Redis for tokens"

Community Reactions

Claude-Mem’s rise has been loud. A few representative voices:

  • From the author’s own celebratory Reddit thread (45K stars milestone): “Anytime it uses observation instead of relearning something… that’s basically 8× more efficient to run off of Claude-mem observations than it is to run off of new file reads.”
  • Enthusiastic user: “A 885-line Python file goes from 8,436 tokens to 542 tokens. That’s 93.6% fewer tokens, and Claude still gets all the signal: class names, function signatures, docstrings, line numbers.” (r/ClaudeAI)
  • Cautious adopter: “I like the idea, but it ended up burning more tokens so I took it out. Looking forward to trying it again though — good persistent memory is still unsolved.”
  • Skeptic: “That’s my worry and the reason I haven’t tried it yet.” — the single most-upvoted reply on that same thread.

Google Trends backs up the interest: “claude managed agents” is up +950% in the last 7 days, and the entire top 10 of GitHub Trending this week is agent infrastructure (BuilderPulse 2026-04-16). Claude-Mem and multica sit side-by-side at ~10.7K weekly stars each.

Honest Limitations

The enthusiasm is real. So are the problems.

1. It can burn your token budget (Issue #618)

The most damaging open issue, #618 (“Uses too much tokens”), is not a theoretical complaint. A user on a medium-sized project reports: “Claude Code consumes all my tokens in <10 messages. It wasn’t like that before.” A separate Reddit post — “Hit 5-hour usage limit in a SINGLE session with ~140k tokens” — cites Claude-Mem as the cause.

The mechanism is easy to reason about: every PostToolUse hook fires the Agent SDK to summarize, and SessionStart injects a slab of context you may not need. On a Pro plan with 5-hour rolling limits, this compounds fast.
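A hedged sketch of that compounding: model each turn as base cost plus injected context plus per-tool summarization. All figures below are illustrative, chosen only to show how a ~140k-token window (the size cited in the Reddit report) can die in single-digit turns.

```typescript
// Rough model of the per-turn overhead; every number is illustrative, not
// a measurement. The point is that cost scales with tool use per turn.
function turnsUntilExhausted(
  budget: number,         // rolling token window, e.g. ~140k as reported
  baseTurn: number,       // what a turn would cost without claude-mem
  injected: number,       // context injected at the start of each turn
  toolUses: number,       // tool calls per turn
  summaryPerTool: number, // summarization triggered per PostToolUse hook
): number {
  const perTurn = baseTurn + injected + toolUses * summaryPerTool;
  return Math.floor(budget / perTurn);
}

// Plausible mid-project numbers exhaust the window in single digits:
const withMem = turnsUntilExhausted(140_000, 6_000, 4_000, 8, 800);
// The same budget with zero memory overhead lasts roughly 3x longer:
const withoutMem = turnsUntilExhausted(140_000, 6_000, 0, 0, 0);
```

Under these assumptions the budget lasts 8 turns with the overhead and 23 without it, which matches the shape of the "<10 messages" complaint even if the real per-call costs differ.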

Workarounds users report working:

  • Set CLAUDE_MEM_MODE to a less aggressive mode
  • Disable PostToolUse capture for read-only tools in settings
  • Use it only on projects where you actually span sessions — not one-off scripts

2. Install complexity is creeping

Claude-Mem auto-installs Bun, uv, and SQLite, and it runs a persistent HTTP server on port 37777. On a managed machine (corporate laptop, shared dev box) this is a real amount of “stuff.” If port 37777 is already in use, you’ll need to change it in ~/.claude-mem/settings.json.

3. AGPL-3.0 license

The license is AGPL-3.0, not MIT or Apache. For personal use this is fine. If you’re building a closed-source product that interacts with a Claude-Mem worker over the network, AGPL’s “network copyleft” clause may apply. Read the license before integrating into a commercial product.

4. Privacy of captured data

Claude-Mem captures everything — including file contents, bash output, and prompts. The data is stored locally in ~/.claude-mem/, but summaries are also sent to Anthropic’s API during compression. There’s a <private> tag to exclude blocks from capture, but you have to remember to use it. If you work with secrets in-terminal, audit your settings.

5. Windows support is rough

The README has a “Windows Setup Notes” section that opens with the error npm : The term 'npm' is not recognized as the name of a cmdlet. WSL2 is the recommended path.

Claude-Mem vs. Alternatives

There’s a small cluster of memory tools for Claude Code, all launched in the last six months:

| Tool | Approach | License | Search |
|---|---|---|---|
| Claude-Mem | Automatic via hooks, SQLite + Chroma | AGPL-3.0 | FTS5 + vector hybrid |
| claude-memory-compiler (coleam00) | Daily log compiled by Agent SDK | MIT | Structured knowledge articles |
| mem0 | Manual MCP calls | Apache-2.0 | Vector-only |
| Official Anthropic “memory” feature | Built-in, opaque | Proprietary | Unknown |

The closest comparison is claude-memory-compiler by Cole Medin. It runs a similar capture-and-compress pipeline but organizes output into “knowledge articles” inspired by Karpathy’s LLM Knowledge Base architecture. It’s MIT licensed and lighter-weight, but has no Chroma vector search — just structured markdown.

If you want zero manual work and semantic search, Claude-Mem is still the most complete option today. If you want something simple you can audit end-to-end in an afternoon, claude-memory-compiler is easier to reason about.

Should You Install It?

Install Claude-Mem if:

  • You run Claude Code against the same repo for weeks or months
  • Your sessions routinely end with “I wish I could remember what we did last time”
  • You’re on an API plan (not a Pro subscription with rolling limits)
  • You’re comfortable with AGPL-3.0 and a local HTTP daemon

Skip it (for now) if:

  • You’re on Claude Pro with tight 5-hour windows
  • Your work is short-form, one-off scripts
  • You can’t audit what Claude-Mem sends to Anthropic during compression
  • You’re on Windows without WSL2

A pragmatic compromise: install it, set CLAUDE_MEM_MODE conservatively, monitor token usage for a week via the web viewer, and uninstall if the math doesn’t work out for your plan.

FAQ

Is Claude-Mem free?

Yes. The plugin itself is AGPL-3.0 licensed and free to install via npx claude-mem install. It does use your Anthropic API credits (or Pro/Max quota) during the background compression step, which is where reports of heavy token burn come from. There’s no paid tier or SaaS version.

Does Claude-Mem send my code to a third party?

Your code stays on disk in ~/.claude-mem/ (SQLite + Chroma). However, Claude-Mem uses the Claude Agent SDK to compress and summarize observations — those calls go to Anthropic’s API just like any Claude Code request. If you work on sensitive code, wrap secrets in <private> tags so they’re excluded from capture, and review the Anthropic data retention policy.

Can I use Claude-Mem with Gemini CLI or OpenCode?

Yes. Use npx claude-mem install --ide gemini-cli or --ide opencode. The hook contract is the same; only the IDE integration points differ. Note that summarization still runs through Claude’s Agent SDK by default — configure an alternative AI provider in ~/.claude-mem/settings.json if you want Gemini to handle compression too.

What’s Endless Mode?

Endless Mode is an experimental beta feature implementing a “biomimetic memory architecture” for very long sessions. Instead of re-reading files to refresh context, Claude consults compressed observations from the current session itself. The author claims ~95% token reduction on long-running tasks. It’s opt-in via the web viewer UI at http://localhost:37777 → Settings → Beta Channel. Expect rough edges.

How do I uninstall Claude-Mem cleanly?

Run npx claude-mem uninstall — it removes the plugin hooks, stops the worker service, and leaves your data directory (~/.claude-mem/) intact in case you want to re-install later. Delete that directory manually if you want a full wipe. Port 37777 will be free again immediately.

Conclusion

Claude-Mem is the most complete answer yet to Claude Code’s amnesia problem, and its 45K+ star count is earned. The hook-based architecture, the 3-layer MCP search pattern, and the local-first SQLite + Chroma storage all reflect serious thought about context engineering.

But the tool is not free in the way that matters most — API tokens. Issue #618 is not a hypothetical. Before you install, decide whether your usage pattern (long-running project, API-based billing) matches the tool’s cost profile. If it does, Claude-Mem is probably the best thing you’ll install this week. If you’re on a Pro plan hammering tight session limits, wait for the token-efficiency work on the main branch to land.

Either way, Claude-Mem is a signal. Agent infrastructure is being built in the open, and persistent memory is the next hard problem to fall.

Links: