TL;DR
CodeGraph is an open-source MCP server from Colby McHenry that gives AI coding agents a pre-indexed, AST-based knowledge graph of your codebase. It is the #2 repo on GitHub Trending this week, with 31,090 stars in total and 21,424 of them gained in the past seven days. Highlights:
- MCP-native — auto-configures Claude Code, Cursor, Codex CLI, opencode, Hermes Agent, Gemini CLI, Antigravity, and Kiro
- Tree-sitter AST extraction across 20+ languages — symbols, call graphs, import chains, and references stored in local SQLite
- No embeddings, no vector DB, no API keys — pure structural graph + FTS5 full-text search
- Auto-syncing via native filesystem watchers (FSEvents on macOS, inotify on Linux, ReadDirectoryChangesW on Windows) with debounced re-indexing
- 35% cheaper, 57% fewer tokens, 46% faster, 71% fewer tool calls in published median-of-4 benchmarks on Claude Opus 4.7 across 7 real codebases
- Framework-aware routing for 14 web frameworks (Django, Flask, FastAPI, Express, NestJS, Laravel, Rails, Spring, ASP.NET, etc.)
- Cross-language bridging for Swift ↔ ObjC and React Native (legacy bridge, TurboModules, Fabric, Expo)
- Apache 2.0, bundles its own runtime, one-command install on macOS, Linux, or Windows
If Claude Context is the vector-DB answer to “stop my AI agent from grepping the same files 50 times,” CodeGraph is the structural answer — and the two are looking like complementary halves of the same problem.
Quick Reference
| Field | Value |
|---|---|
| Repository | github.com/colbymchenry/codegraph |
| License | Apache 2.0 |
| Language | TypeScript |
| Author | Colby McHenry |
| Stars | 31,090 (+21,424 this week) |
| Install | curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh \| sh |
| NPM | @colbymchenry/codegraph (also works via npx) |
| Storage | Local SQLite, FTS5 full-text search |
| Requires | Nothing (bundled runtime), or Node ≥ 18 for npm install |
What Is CodeGraph?
When Claude Code answers an architecture question — “how does X talk to Y here?” — it doesn’t actually know your codebase. It launches Explore sub-agents that fan out across grep, glob, and Read, follow imports, re-read the same files, and spend most of their token budget on discovery before they can answer.
CodeGraph attacks that discovery cost by pre-computing what those sub-agents would otherwise have to learn from scratch. It parses your repo with tree-sitter, extracts every symbol, call site, import, and reference, and stores them as nodes and edges in a local SQLite database. An MCP server exposes that graph through three primary tools: codegraph_context, codegraph_explore, and codegraph_status.
No embeddings, no API keys, no Docker, no vector store. The SQLite file lives in .codegraph/. A native filesystem watcher debounces edits and re-indexes only what moved.
The pitch: the agent already pays for tool calls — make those tool calls answer with structure instead of bytes.
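To make "symbols as nodes, calls as edges" concrete, here is a minimal sketch of how such a graph could sit in SQLite and answer a "who calls processOrder?" question with one query. The table names, column names, and sample rows are assumptions made up for illustration; they are not CodeGraph's actual schema.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not CodeGraph's real tables.
SCHEMA = """
CREATE TABLE nodes (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL,   -- e.g. 'function', 'class', 'module'
    name TEXT NOT NULL,
    path TEXT NOT NULL
);
CREATE TABLE edges (
    src  INTEGER NOT NULL REFERENCES nodes(id),
    dst  INTEGER NOT NULL REFERENCES nodes(id),
    kind TEXT NOT NULL    -- e.g. 'calls', 'imports', 'references'
);
CREATE INDEX edges_by_dst ON edges(dst, kind);
"""


def build_demo_graph() -> sqlite3.Connection:
    """Create an in-memory graph with four made-up functions and three call edges."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", [
        (1, "function", "processOrder", "src/orders.ts"),
        (2, "function", "checkout", "src/cart.ts"),
        (3, "function", "retryFailedOrders", "src/jobs.ts"),
        (4, "function", "chargeCard", "src/payments.ts"),
    ])
    db.executemany("INSERT INTO edges VALUES (?, ?, ?)", [
        (2, 1, "calls"),  # checkout -> processOrder
        (3, 1, "calls"),  # retryFailedOrders -> processOrder
        (1, 4, "calls"),  # processOrder -> chargeCard
    ])
    return db


def callers_of(db: sqlite3.Connection, name: str) -> list:
    """Return (caller name, caller path) for every 'calls' edge pointing at `name`."""
    rows = db.execute(
        """
        SELECT caller.name, caller.path
        FROM edges
        JOIN nodes AS callee ON callee.id = edges.dst
        JOIN nodes AS caller ON caller.id = edges.src
        WHERE edges.kind = 'calls' AND callee.name = ?
        ORDER BY caller.name
        """,
        (name,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(callers_of(build_demo_graph(), "processOrder"))
```

The point of the sketch is the shape of the lookup: one indexed join replaces a round of grep, glob, and Read calls.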
Why It’s Trending Now
Three things converged this week:
1. Reproducible benchmarks. The README’s headline numbers — 35% cheaper, 57% fewer tokens, 46% faster, 71% fewer tool calls — are eye-catching, but what convinced Hacker News was the methodology: claude -p Opus 4.7 headless, --strict-mcp-config, 4 runs per arm, median reported, raw per-repo numbers published. Reproducible, not marketing.
2. Structural counter-pitch to vector search. A month after Claude Context brought BM25 + embeddings to MCP, CodeGraph argues you don’t need either. Symbol graphs are deterministic, lossless, and don’t drift when you rename a function. For “who calls processOrder?”, a graph answers in one query. Embeddings have to guess.
3. Zero infrastructure. No vector DB, no embedding API, no API keys. codegraph init -i and the MCP server wires into every coding agent on your machine.
How It Works
CodeGraph is three layers stacked on top of tree-sitter:
Parse layer. Tree-sitter grammars produce ASTs for 20+ languages — TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Objective-C, Swift, Kotlin, Dart, Lua, Luau, Svelte, Liquid, and Pascal/Delphi. CodeGraph walks each AST and extracts symbol declarations, references, imports, exports, and call sites.
Graph layer. Symbols become nodes; calls, references, and imports become edges. Stored in SQLite with an FTS5 virtual table for name search. On top of that, CodeGraph layers two enrichments:
- Framework-aware route recognition for 14 web frameworks. Django path(), FastAPI @router.get(...), Express router.post(...), NestJS @Controller + @Get, Rails get '/x', to: 'users#index', Spring @GetMapping, ASP.NET [HttpGet]: each becomes a route node linked to its handler. “Who handles /api/orders?” now jumps straight to the controller.
- Cross-language bridging for iOS / React Native / Expo. Swift ↔ ObjC @objc auto-bridging, JS NativeModules.X.fn(...) linked to ObjC RCT_EXPORT_METHOD or Java/Kotlin @ReactMethod, Fabric components, TurboModule specs, native → JS event emitters, and Expo’s Module { Name("X"); AsyncFunction("fn") } DSL. The kind of thing static parsers normally drop on the floor.
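As a rough illustration of route recognition, the sketch below pulls (method, path, handler) triples out of FastAPI-style decorators. It uses Python's built-in ast module instead of tree-sitter, purely to stay dependency-free, and the names extract_routes and SAMPLE are hypothetical. CodeGraph's real extractor is not shown in the source, so treat this as the general idea only.

```python
import ast

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def extract_routes(source: str) -> list:
    """Return sorted (METHOD, path, handler_name) triples for decorators like @router.get("/x")."""
    routes = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for deco in node.decorator_list:
            # Only consider calls on an attribute, e.g. router.get(...)
            if not (isinstance(deco, ast.Call) and isinstance(deco.func, ast.Attribute)):
                continue
            if deco.func.attr not in HTTP_METHODS or not deco.args:
                continue
            first = deco.args[0]
            # The route path must be a literal string to be resolved statically
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                routes.append((deco.func.attr.upper(), first.value, node.name))
    return sorted(routes)


# The sample is only parsed, never executed, so FastAPI does not need to be installed.
SAMPLE = '''
from fastapi import APIRouter
router = APIRouter()

@router.get("/api/orders")
async def list_orders():
    return []

@router.post("/api/orders")
async def create_order(payload: dict):
    return payload

def helper():
    return None
'''

if __name__ == "__main__":
    for method, path, handler in extract_routes(SAMPLE):
        print(method, path, "->", handler)
```

Each triple maps naturally onto a route node with an edge to its handler symbol, which is what lets a question about /api/orders land on the right function.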
MCP layer. A Node server speaks MCP and exposes three primary tools: codegraph_context(area) (entry points + related symbols), codegraph_explore(symbol) (full source plus immediate neighbors), and codegraph_status (pending edits, freshness banner).
The Benchmark, In Detail
The README’s benchmark is the most-discussed part of the project. Here’s the raw shape (medians of 4 runs per arm, Claude Opus 4.7 headless, --strict-mcp-config):
| Codebase | Language · Files | Cost WITH → WITHOUT | Tokens | Time | Tool calls |
|---|---|---|---|---|---|
| VS Code | TS · ~10k | $0.60 → $0.80 | 601k → 2.8M | 1m 10s → 2m 26s | 8 → 55 |
| Excalidraw | TS · ~640 | $0.43 → $0.90 | 344k → 3.5M | 48s → 2m 58s | 3 → 79 |
| Django | Py · ~3k | $0.59 → $0.67 | 739k → 1.2M | 1m 19s → 1m 38s | 9 → 19 |
| Tokio | Rust · ~790 | $0.42 → $2.41 | 379k → 2.6M | 53s → 3m 2s | 4 → 53 |
| OkHttp | Java · ~645 | $0.47 → $0.47 | 636k → 730k | 42s → 1m 1s | 6 → 11 |
| Gin | Go · ~110 | $0.37 → $0.47 | 444k → 675k | 44s → 1m 0s | 6 → 10 |
| Alamofire | Swift · ~110 | $0.61 → $1.14 | 1.0M → 2.8M | 1m 17s → 2m 27s | 12 → 69 |
Three things stand out:
- Gains scale with codebase size. On VS Code (~10k files) the no-CodeGraph arm needs 55 tool calls and reads 2.8M tokens. On Gin (~110 files), native grep is already cheap and CodeGraph’s edge collapses to 21% cheaper. ROI is real around the 1k-file mark, dramatic above 5k.
- Tool calls drop harder than tokens. On Excalidraw it’s 3 vs 79, a 96% reduction. The WITHOUT arm spawns Explore sub-agents that themselves read files, multiplying calls. CodeGraph short-circuits the tree at the parent.
- OkHttp is the honest outlier. About 2% cheaper (both arms round to $0.47 in the table), and tokens move only from 730k to 636k. Its query hits a small, localized part of the code where grep was already efficient. Not every question rewards a graph.
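The percentages quoted in these bullets can be re-derived from the table. The sketch below copies the cost and tool-call columns and computes the per-repo reduction as (WITHOUT - WITH) / WITHOUT; the variable names are mine, and the inputs are the rounded table values, so results can differ slightly from the README's own figures.

```python
from statistics import mean, median

# (repo, cost_with, cost_without, calls_with, calls_without), copied from the table above
ROWS = [
    ("VS Code",    0.60, 0.80,  8, 55),
    ("Excalidraw", 0.43, 0.90,  3, 79),
    ("Django",     0.59, 0.67,  9, 19),
    ("Tokio",      0.42, 2.41,  4, 53),
    ("OkHttp",     0.47, 0.47,  6, 11),
    ("Gin",        0.37, 0.47,  6, 10),
    ("Alamofire",  0.61, 1.14, 12, 69),
]


def reduction(with_cg: float, without_cg: float) -> float:
    """Fractional saving of the WITH arm relative to the WITHOUT arm."""
    return (without_cg - with_cg) / without_cg


cost_cuts = {row[0]: reduction(row[1], row[2]) for row in ROWS}
call_cuts = {row[0]: reduction(row[3], row[4]) for row in ROWS}

if __name__ == "__main__":
    for repo in cost_cuts:
        print(f"{repo:<11} cost -{cost_cuts[repo]:.0%}  tool calls -{call_cuts[repo]:.0%}")
    print(f"cost:       mean -{mean(cost_cuts.values()):.0%}, median -{median(cost_cuts.values()):.0%}")
    print(f"tool calls: mean -{mean(call_cuts.values()):.0%}, median -{median(call_cuts.values()):.0%}")
```

On these rounded inputs the mean cost reduction lands at about 34% and the mean tool-call reduction at about 71%, which suggests the headline figures are averages of per-repo reductions. The median cost reduction across the seven repos is lower, at about 25%.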
The author’s own caveat is healthy: CodeGraph only helps when queried directly — if the parent agent delegates exploration to a file-reading sub-agent, the graph never gets called and becomes overhead. The system prompt shim matters as much as the index.
Getting Started
The installer is one command:
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh
# Windows PowerShell
irm https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.ps1 | iex
# Or via npm
npx @colbymchenry/codegraph
Then, in your project root:
cd your-project
codegraph init -i
The -i flag launches an interactive installer that detects every coding agent on your system — Claude Code, Cursor, Codex CLI, opencode, Hermes Agent, Gemini CLI, Antigravity, Kiro — and writes the MCP config and instruction shim into each one you select. No manual JSON editing.
Open Claude Code and start asking architecture questions. The first session triggers the initial index (seconds for small repos, minutes for VS Code-scale). The filesystem watcher keeps it current after that.
To uninstall: codegraph uninstall strips MCP config from every agent it touched; codegraph uninit removes .codegraph/ from the project. Cleanest uninstall story in the MCP ecosystem.
Real-World Use Cases
- Onboarding to a large codebase. Drop CodeGraph on a 100k-line monorepo, ask “how does authentication flow work end to end?” — get a routed answer hitting middleware, the JWT verifier, and the user store in one tool call.
- Refactor impact analysis. “What breaks if I change the signature of processPayment?” One codegraph_explore call returns every caller and callee.
- Cross-language iOS apps. Swift ↔ ObjC and React Native bridge support means “where does this JS prop end up on the native side?” actually resolves across the boundary.
- Cutting Claude/OpenAI API spend. Reddit reports in r/ClaudeCode put the saving at 30–50% on long sessions, consistent with the README’s 35% median.
- Auto-fresh long sessions. The file watcher debounces and re-indexes, so multi-hour agent sessions don’t drift from the working tree.
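The debouncing in that last bullet is easy to picture with a small sketch. The class below batches changed paths and releases them only after edits have been quiet for a set window. Debouncer, on_change, and poll are hypothetical names, and timestamps are passed in explicitly so the logic can be followed without a real filesystem watcher; CodeGraph's own watcher code is not described in the source.

```python
from typing import List, Optional, Set


class Debouncer:
    """Collect changed paths and release them once edits have been quiet for `window` seconds."""

    def __init__(self, window: float = 2.0) -> None:
        self.window = window
        self.pending: Set[str] = set()
        self.last_event: Optional[float] = None

    def on_change(self, path: str, now: float) -> None:
        """Record a change; every new edit restarts the quiet period."""
        self.pending.add(path)
        self.last_event = now

    def poll(self, now: float) -> List[str]:
        """Return the batch to re-index, or [] while edits are still arriving."""
        if self.last_event is None or now - self.last_event < self.window:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        self.last_event = None
        return batch


if __name__ == "__main__":
    d = Debouncer(window=2.0)
    d.on_change("src/cart.ts", now=0.0)
    d.on_change("src/orders.ts", now=1.5)
    d.on_change("src/cart.ts", now=1.6)  # repeated path is coalesced
    print(d.poll(now=3.0))  # still inside the quiet window
    print(d.poll(now=3.6))  # window elapsed, batch released
```

Batching this way means a burst of saves from an agent or a formatter triggers one re-index of the touched files rather than one per write.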
First Impressions From the Community
Reception on Hacker News and r/ClaudeCode this week has been warm — partly the reproducible benchmark, partly the painless install:
“The MCP config auto-writes into every agent on your machine in one shot. codegraph init -i and Claude Code suddenly stops grepping.” — r/ClaudeCode
“Symbol graph beats embeddings for ‘who calls this?’ questions. Embeddings are fuzzy by design. CodeGraph just knows.” — Hacker News
“On a 200k-line legacy Java service we cut Claude Code’s average session cost from $4 to $1.50.” — Reddit testimonial
The common gripe is the converse: for fuzzy semantic questions (“find the place that probably handles edge cases in checkout”), a symbol graph isn’t as good as a vector store. Several commenters already run CodeGraph and Claude Context side by side.
Honest Limitations
CodeGraph is impressive, but worth knowing before you bet on it:
- Symbol graphs don’t help with fuzzy questions. If you don’t know what the symbol is called, the graph can’t find it. Vector search degrades more gracefully here.
- First-index time scales with repo size. A 10k-file TS repo takes a couple of minutes to parse. Incremental after that, but the initial wait is real.
- Tree-sitter coverage varies. Top-tier languages (TS, JS, Python, Go, Rust, Java) are excellent. Pascal/Delphi and Liquid work but with thinner symbol coverage. Anything outside the 20+ list falls back to FTS5 text search.
- Benchmark is one question per repo. Real sessions ask many questions, and a graph handles some of them worse than grep. Expect typical savings closer to the lower half of the table than to the headline average.
- No multi-repo workspace yet. The index lives per project, so a microservices setup means multiple .codegraph/ directories with no cross-repo query.
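For a feel of what FTS5 name search, the fallback mentioned in the tree-sitter bullet above, looks like in practice, here is a small sketch using Python's sqlite3 module. The symbols table, its columns, and the sample rows are assumptions for illustration, and it needs an SQLite build with FTS5 enabled, which most Python distributions ship.

```python
import sqlite3


def build_index(names_and_paths: list) -> sqlite3.Connection:
    """Create an in-memory FTS5 table of (name, path) rows; only `name` is searchable."""
    db = sqlite3.connect(":memory:")
    # Hypothetical table layout; requires SQLite compiled with the FTS5 extension.
    db.execute("CREATE VIRTUAL TABLE symbols USING fts5(name, path UNINDEXED)")
    db.executemany("INSERT INTO symbols (name, path) VALUES (?, ?)", names_and_paths)
    return db


def search(db: sqlite3.Connection, query: str) -> list:
    """Run an FTS5 query (for example a prefix query like 'process*') against symbol names."""
    rows = db.execute(
        "SELECT name, path FROM symbols WHERE symbols MATCH ? ORDER BY name",
        (query,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_index([
        ("processOrder", "src/orders.pas"),
        ("processPayment", "src/pay.pas"),
        ("validateCart", "src/cart.pas"),
    ])
    print(search(db, "process*"))
```

Text search like this finds names but knows nothing about who calls what, which is why coverage outside the supported language list is thinner.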
Who Should Use This (And Who Shouldn’t)
Use CodeGraph if:
- You work on a 1k+ file codebase and your AI agent burns tokens on discovery
- You want zero infrastructure — no embedding API, no vector DB, no Docker
- You hop between Claude Code, Cursor, and Codex CLI and want one index across all of them
- You work on iOS / React Native and lose context at the bridge
Skip CodeGraph if:
- Your repo is under ~300 files (native grep is fast enough)
- Your questions are mostly semantic (“find the place that handles X” without knowing the symbol name); Claude Context fits better
- You can’t run a local SQLite file or are in a sandboxed environment with no filesystem watcher (CODEGRAPH_NO_DAEMON=1 works, but you’ll need manual codegraph sync)
CodeGraph vs. Claude Context vs. Other Indexers
The MCP code-search space has crystallized into two distinct approaches:
| Tool | Approach | Storage | Local-only | MCP | Multi-client | Best for |
|---|---|---|---|---|---|---|
| CodeGraph | AST symbol graph | SQLite | ✅ always | ✅ | ✅ 8+ | Structural questions, refactors |
| Claude Context | Hybrid BM25 + embeddings | Milvus / Zilliz | ⚠️ via Ollama | ✅ | ✅ 13+ | Semantic questions, vague queries |
| Cursor Codebase Index | Embeddings | Cursor cloud | ❌ | ❌ | ❌ Cursor only | Cursor users |
| Aider repo-map | Tree-sitter graph | In-memory | ✅ | ❌ | ❌ Aider only | Aider users |
| Sourcegraph Cody | Hybrid + graph | Sourcegraph | ✅ enterprise | ❌ | ❌ | Enterprise |
| Continue @codebase | Embeddings | LanceDB | ✅ | ❌ | ❌ Continue only | Continue users |
Symbol graphs vs. embeddings is not a winner-take-all fight. The two answer different question shapes. CodeGraph nails “who calls X?”, “what’s the route for /api/orders?”, “what breaks if I rename Y?”. Claude Context nails “find the place that handles the corner case where users have two emails.” Several commenters this week are running both — the graph as the structural source of truth, the vector store for fuzzy recall.
If you only have time to add one MCP server this week and your codebase is over 1k files: CodeGraph is the lower-friction install (no API keys, no Docker) and lands the bigger token reduction on architecture questions, which is what most agents waste budget on.
FAQ
Does CodeGraph work with Cursor and Codex CLI, or only Claude Code?
It auto-configures eight clients: Claude Code, Cursor, Codex CLI, opencode, Hermes Agent, Gemini CLI, Antigravity IDE, and Kiro. The interactive installer (codegraph init -i) detects which are present and lets you choose. The MCP server itself is client-agnostic — anything that speaks MCP can connect.
How does CodeGraph compare to Claude Context (Zilliz’s MCP indexer)?
CodeGraph uses a tree-sitter symbol graph in local SQLite. Claude Context uses BM25 + dense embeddings in Milvus. CodeGraph wins on structural questions (“who calls X?”, “what’s the route for Y?”) and zero-infrastructure setup. Claude Context wins on fuzzy semantic questions and recall when you don’t know the symbol name. They’re complementary, and several teams run both.
Is CodeGraph really 100% local?
Yes. No API keys, no embeddings, no external services. The graph is a SQLite database in .codegraph/ inside your project. The MCP server runs as a local Node process. Nothing leaves your machine.
Do I need Node.js installed?
No. The native installer (install.sh / install.ps1) bundles its own runtime. If you already have Node, npx @colbymchenry/codegraph works too — both paths land at the same binary.
How does the auto-sync work? Do I need to run codegraph sync manually?
You don’t. A native filesystem watcher (FSEvents on macOS, inotify on Linux, ReadDirectoryChangesW on Windows) catches every file change and re-indexes after a 2-second debounce (tunable). On reconnect, the MCP server does a fast (size, mtime) + content-hash reconciliation. Manual codegraph sync only matters in sandboxed environments where the watcher is disabled.
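The "(size, mtime) + content-hash" reconciliation can be sketched as a two-stage check: compare the cheap stat fields first, and hash the file only when they differ. The function names and the choice of SHA-256 are assumptions for illustration; the source does not say which hash CodeGraph uses or how it stores the recorded values.

```python
import hashlib
import os
import tempfile


def fingerprint(path: str) -> dict:
    """Record what an indexer might store per file: size, mtime, and a content hash."""
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {"size": st.st_size, "mtime_ns": st.st_mtime_ns, "sha256": digest}


def needs_reindex(path: str, recorded: dict) -> bool:
    """Cheap stat comparison first; read and hash the file only if size or mtime moved."""
    st = os.stat(path)
    if (st.st_size, st.st_mtime_ns) == (recorded["size"], recorded["mtime_ns"]):
        return False  # unchanged as far as the cheap check can tell
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() != recorded["sha256"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "a.ts")
        with open(path, "w") as f:
            f.write("export const a = 1;\n")
        recorded = fingerprint(path)
        print(needs_reindex(path, recorded))  # file untouched since fingerprinting
        with open(path, "w") as f:
            f.write("export const a = 2; // edited\n")
        print(needs_reindex(path, recorded))  # content and size changed
```

The hash stage is what keeps a touched-but-identical file (for example after a checkout that rewrites timestamps) from being re-parsed.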
Are the benchmark numbers reproducible?
Yes. The README publishes the methodology (claude -p Opus 4.7 headless, --strict-mcp-config, 4 runs per arm, median reported), the exact query for each of the 7 repos, and the raw WITH → WITHOUT medians per cell. You can clone any of the benchmark repos at --depth 1 and run the same comparison yourself.
Bottom Line
CodeGraph is the strongest pitch yet for symbol graphs as the structural layer beneath AI coding agents. The benchmark is reproducible, the install story is the lowest friction in MCP code search, and the 21,424 stars in seven days suggest a lot of developers had the same thought: I’m tired of watching Claude Code re-grep the same files.
If your repo is over 1,000 files and you’re paying for an agent’s tool calls, CodeGraph likely pays for itself in this week’s Claude bill. Run it alongside Claude Context for fuzzy recall and you have the closest thing to a complete MCP code-intelligence stack that exists today.
Repo: github.com/colbymchenry/codegraph — Apache 2.0 licensed, 31K stars and gaining ~3K/day.