TL;DR
LangChain Deep Agents is an open-source agent harness that gives you a production-ready autonomous agent out of the box — with planning, subagents, a virtual filesystem, and smart context management. Key highlights:
- MIT licensed, built on LangGraph — fully open-source and extensible
- v0.5 just shipped with async subagents, multimodal filesystem support
- ~42.65% on TerminalBench 2.0 using Claude Sonnet 4.5 — on par with Claude Code at the same model tier
- Model-agnostic — works with any LLM that supports tool calling (OpenAI, Anthropic, open models)
- Planning tools — built-in write_todos for task decomposition and progress tracking
- Subagent delegation — spawn isolated agents for parallel work with separate context windows
- Virtual filesystem — offload artifacts to storage instead of burning context tokens
- CLI included — terminal coding agent with interactive TUI, headless mode for CI/CD
- MCP support — via langchain-mcp-adapters for tool integration
Install: pip install deepagents
Think of it as LangChain’s answer to Claude Code — but model-neutral, fully open-source, and designed to be embedded in your own applications.
Quick Reference
| Detail | Info |
|---|---|
| Repository | langchain-ai/deepagents |
| License | MIT |
| Language | Python (JS version also available) |
| Latest Version | v0.5 |
| Install | pip install deepagents or uv add deepagents |
| CLI Install | curl -LsSf https://raw.githubusercontent.com/langchain-ai/deepagents/main/libs/cli/scripts/install.sh \| bash |
| Built On | LangGraph |
| MCP Support | Yes (via langchain-mcp-adapters) |
| Docs | docs.langchain.com |
What Is Deep Agents?
Deep Agents is what LangChain calls an “agent harness” — a distinction that matters. Unlike LangGraph (which gives you low-level primitives to build agents from scratch), Deep Agents gives you a working autonomous agent immediately. You install it, point it at a model, and it can plan tasks, read/write files, execute shell commands, and spawn sub-agents — all out of the box.
The project was explicitly inspired by Claude Code. From the README:
This project was primarily inspired by Claude Code, and initially was largely an attempt to see what made Claude Code general purpose, and make it even more so.
The key difference: Deep Agents doesn’t lock you into any specific model or cloud provider. You can run it with GPT-4o, Claude, Gemini, Llama, or any other model that supports tool calling. Since it returns a compiled LangGraph graph, you get streaming, persistence, checkpointing, and the entire LangGraph ecosystem for free.
Why Deep Agents Matters Right Now
The coding agent space is exploding. We’ve got Claude Code, OpenAI Codex, Cursor, and a growing list of open-source alternatives. So why does Deep Agents deserve attention?
1. It’s the first major open-source harness from a framework company
LangChain isn’t a startup building a single product — they’re the most widely-used LLM framework. Deep Agents is their opinionated take on what a production agent should look like, built on years of seeing what works and what doesn’t in LangGraph deployments.
2. Model neutrality is actually useful
Most coding agents are tied to specific providers. Claude Code needs Anthropic. Codex needs OpenAI. Deep Agents lets you swap models per task — use a cheap model for routine subagent work, a capable model for the planner, and your favorite local model for sensitive code.
3. Async subagents change the game
The v0.5 release (shipped last week) introduced async subagents — background agents that run on remote servers while the main agent continues working. This isn’t just a feature; it’s an architecture shift that enables heterogeneous deployments where different agents run on different hardware with different models.
Key Features Deep Dive
Planning with write_todos
Instead of trying to hold an entire complex task in its context window, Deep Agents breaks work down explicitly:
```python
from deepagents import create_deep_agent

agent = create_deep_agent(model="anthropic:claude-sonnet-4-6")

# The agent automatically uses write_todos to decompose tasks
result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Refactor the authentication module to use JWT tokens instead of sessions"
    }]
})
```
The agent writes a todo list, works through items sequentially, and marks them complete. This gives long-running tasks a recoverable state — if something fails, you know exactly where it stopped.
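That recoverable state is easy to picture as a list of status-tagged items. A toy sketch of the idea — not the actual deepagents implementation; the field names ("content", "status") and status values are assumptions for illustration:

```python
# Illustrative todo state behind a planning tool like write_todos.
# Field names and status values are assumed, not the real schema.
todos = [
    {"content": "Locate session-based auth code", "status": "completed"},
    {"content": "Introduce JWT issuing/verification", "status": "in_progress"},
    {"content": "Migrate login endpoint", "status": "pending"},
    {"content": "Update tests", "status": "pending"},
]

def next_pending(todos):
    """Return the first item not yet completed -- where a failed run resumes."""
    for item in todos:
        if item["status"] != "completed":
            return item["content"]
    return None

print(next_pending(todos))  # the in-progress item is the resume point
```

Because the list lives in agent state rather than buried in conversation history, a restarted run can pick up from the first non-completed item instead of replanning from scratch.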
Subagent Delegation
Deep Agents can spawn isolated subagents for independent work. Each subagent gets its own context window, preventing token waste from unrelated history:
```python
from deepagents import create_deep_agent, SubAgent

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    subagents=[
        SubAgent(
            name="test_writer",
            description="Writes unit tests for given code",
            model="openai:gpt-4o-mini",  # cheaper model for subtasks
        ),
    ],
)
```
With v0.5, you can also use AsyncSubAgent for non-blocking background work:
```python
from deepagents import AsyncSubAgent, create_deep_agent

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    subagents=[
        AsyncSubAgent(
            name="researcher",
            description="Performs deep research on a topic.",
            url="https://my-agent-server.dev",
            graph_id="research_agent",
        ),
    ],
)
```
The main agent gets five tools for managing async work: start_async_task, check_async_task, update_async_task, cancel_async_task, and list_async_tasks. Multiple async subagents can run concurrently.
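Those five tools map naturally onto a task-registry pattern. The tool names come from the release notes; the registry below is my own minimal sketch of the lifecycle they imply, not deepagents internals:

```python
import uuid

class AsyncTaskRegistry:
    """Toy registry showing the start/check/cancel lifecycle of async
    subagent tasks -- an illustration, not the deepagents implementation."""

    def __init__(self):
        self.tasks = {}

    def start_async_task(self, subagent, prompt):
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = {"subagent": subagent, "prompt": prompt, "status": "running"}
        return task_id  # main agent keeps working; task runs elsewhere

    def check_async_task(self, task_id):
        return self.tasks[task_id]["status"]

    def cancel_async_task(self, task_id):
        self.tasks[task_id]["status"] = "cancelled"

    def list_async_tasks(self):
        return list(self.tasks)

registry = AsyncTaskRegistry()
tid = registry.start_async_task("researcher", "Survey JWT libraries")
print(registry.check_async_task(tid))  # running
```

The key property is that start_async_task returns immediately with a handle, so the planner can interleave its own work with polling via check_async_task.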
Virtual Filesystem
Large tool outputs and intermediate artifacts get offloaded to storage instead of staying in the active context. This is crucial for long-running tasks where context windows would otherwise overflow:
```python
# The agent can read, write, and manage files
# Large outputs are automatically saved to the virtual filesystem
# Supports pluggable backends: S3, local disk, or in-memory
```
The v0.5 release expanded this to multimodal files — PDFs, images, and video can now be stored and referenced.
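The pluggable-backend idea boils down to a key-value store behind read/write/list operations. A toy in-memory backend to make that concrete — the method names and interface here are assumptions for illustration, not the deepagents backend API:

```python
class InMemoryBackend:
    """Toy virtual-filesystem backend mapping path -> bytes.
    Interface names are assumed; the real deepagents API may differ."""

    def __init__(self):
        self._store = {}

    def write(self, path, data):
        self._store[path] = data

    def read(self, path):
        return self._store[path]

    def ls(self, prefix=""):
        return [p for p in self._store if p.startswith(prefix)]

fs = InMemoryBackend()
fs.write("artifacts/search_results.json", b'{"hits": 1200}')
# The agent's context needs to hold only the path, not the full payload
print(fs.ls("artifacts/"))
```

Swapping this for an S3- or disk-backed class with the same interface is what makes the backend "pluggable": the agent's file tools don't care where bytes land.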
Context Management Middleware
Deep Agents includes automatic context management that:
- Compresses conversation history when it gets too long
- Offloads large tool results to the filesystem
- Isolates context per subagent to prevent token waste
- Uses prompt caching to reduce latency and cost
This middleware is what separates a toy agent from a production one. Without it, any agent working on a multi-hour task will inevitably exceed its context window and start losing information.
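A crude version of the compression step: keep the system prompt and the most recent turns, and fold everything in between into a single summary message. The real middleware uses an LLM to write the summary and is far more careful; this sketch only shows the shape of the transformation:

```python
def compress_history(messages, keep_last=4):
    """Toy context compression: preserve the system message and the last
    few turns, replace the middle with a placeholder summary message."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if len(rest) <= keep_last:
        return system + rest
    dropped = rest[:-keep_last]
    summary = {"role": "user",
               "content": f"[Summary of {len(dropped)} earlier messages]"}
    return system + [summary] + rest[-keep_last:]

history = [{"role": "system", "content": "You are a coding agent."}] + [
    {"role": "user", "content": f"step {i}"} for i in range(10)
]
print(len(compress_history(history)))  # system + summary + last 4 = 6
```

Token count now grows with keep_last rather than with total conversation length, which is the property that keeps multi-hour runs inside the context window.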
The CLI: A Terminal Coding Agent
Deep Agents ships with a terminal-based coding agent that rivals Claude Code and Codex CLI:
```bash
# Install the CLI
curl -LsSf https://raw.githubusercontent.com/langchain-ai/deepagents/main/libs/cli/scripts/install.sh | bash

# Or via pip
pip install deepagents-cli
```
The CLI features:
- Interactive TUI built with Textual — rich terminal interface with streaming
- Headless mode for scripting and CI/CD pipelines
- Web search — ground responses in live information (requires Tavily API key)
- All SDK features — subagents, persistent memory, custom skills, human-in-the-loop
- ACP server mode — run the agent as an ACP-compatible server with --acp
TerminalBench Performance
The Deep Agents CLI scored ~42.65% on TerminalBench 2.0 using Claude Sonnet 4.5 — which puts it on par with Claude Code running the same model. That’s a significant result: it means the harness architecture (planning, subagents, context management) is competitive with purpose-built proprietary agents.
Getting Started
Basic Setup (5 Minutes)
```bash
# Install
pip install deepagents
# or
uv add deepagents

# Set your API key
export ANTHROPIC_API_KEY="your-key"
# or OPENAI_API_KEY, etc.
```
Your First Agent
```python
from deepagents import create_deep_agent

# Create an agent with default tools
agent = create_deep_agent()

# Run it
result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Research LangGraph and write a summary"
    }]
})
```
That’s it. The agent can plan, read/write files, execute shell commands, and manage its own context — all with zero configuration.
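The invoke call returns the final LangGraph state, and the assistant's reply is the last entry in messages. A sketch of pulling it out — a real run returns LangChain message objects with a .content attribute; plain dicts are used here so the example stays self-contained:

```python
# Stand-in for the state returned by agent.invoke. Real runs return
# LangChain message objects; dicts are used here for illustration.
result = {
    "messages": [
        {"role": "user", "content": "Research LangGraph and write a summary"},
        {"role": "assistant", "content": "LangGraph is a graph runtime for agents..."},
    ]
}

# The final message in the list is the agent's answer
final_reply = result["messages"][-1]["content"]
print(final_reply)
```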
Custom Agent
```python
from langchain.chat_models import init_chat_model
from deepagents import create_deep_agent

agent = create_deep_agent(
    model=init_chat_model("openai:gpt-4o"),
    tools=[my_custom_tool],  # a tool you have defined elsewhere
    system_prompt="You are a research assistant.",
)
```
MCP Integration
```python
# Deep Agents supports MCP tools via langchain-mcp-adapters
# Connect to any MCP server for extended tool capabilities
```
Deep Agents vs Claude Code vs Codex
| Feature | Deep Agents | Claude Code | Codex CLI |
|---|---|---|---|
| License | MIT | Proprietary | Open-source |
| Model Support | Any LLM | Claude only | OpenAI only |
| Planning | write_todos | Internal | Internal |
| Subagents | Sync + Async | No | No |
| Context Mgmt | Auto-summarize | Proprietary | Basic |
| MCP Support | Yes | Yes | No |
| Filesystem | Virtual + pluggable | Direct | Direct |
| Deployment | Anywhere | Anthropic | OpenAI |
| TerminalBench | ~42.65% | ~42% (same model) | ~38% |
| Customizable | Fully | Limited | Limited |
When to choose Deep Agents: You want model flexibility, need to embed agents in your own applications, want async subagent orchestration, or are already in the LangChain ecosystem.
When to choose Claude Code: You want the best single-agent coding experience and are committed to Anthropic’s models.
When to choose Codex: You’re in the OpenAI ecosystem and want tight GPT integration.
Who Should Use This (And Who Shouldn’t)
Great For:
- Teams building agent-powered products — embed Deep Agents in your backend
- Organizations with model flexibility requirements — switch providers without rewriting
- Complex workflows — multi-step tasks needing planning and delegation
- CI/CD automation — headless mode for automated coding pipelines
- LangChain users — natural upgrade from raw LangGraph agent loops
Not Ideal For:
- Simple chatbots — overkill for basic Q&A
- One-off coding tasks — Claude Code or Cursor are faster for quick edits
- Teams without Python expertise — a JS version exists, but Python is the primary implementation
- Those wanting zero configuration — still requires API keys and some setup
Honest Limitations
- Young project — Deep Agents launched in March 2026. Documentation is solid but the ecosystem of community plugins and skills is still small compared to Claude Code.
- LangChain dependency — You’re buying into the LangChain/LangGraph stack. If you prefer lighter frameworks, this adds weight.
- Async subagents need infrastructure — The new async feature requires a remote agent server running Agent Protocol. It’s not plug-and-play yet.
- CLI is good but not great — The TUI works well, but Claude Code’s terminal experience is more polished. Deep Agents CLI feels more like a developer tool than a daily driver.
- Performance depends on model — The 42.65% TerminalBench score is with Claude Sonnet 4.5. Results with cheaper or open models will vary significantly.
What the Community Is Saying
The response on Reddit and Hacker News has been cautiously positive:
- “Finally, a model-agnostic Claude Code” — the most common reaction, especially from teams locked into multi-provider strategies
- “The async subagent architecture is what we’ve been building internally” — several teams report building similar patterns on raw LangGraph
- “TerminalBench parity with Claude Code is impressive” — the benchmark result surprised many who expected proprietary tools to dominate
- Some skepticism about whether another LangChain project will be maintained long-term, given the framework’s history of rapid iteration
FAQ
Is Deep Agents free to use?
Yes. Deep Agents is MIT licensed and fully open-source. You’ll need API keys for the LLM providers you use (Anthropic, OpenAI, etc.), but the framework itself is free. LangSmith integration for tracing and deployment is optional and has its own pricing.
Can I use Deep Agents with local models?
Yes. Any model that supports tool calling works. You can use Ollama-hosted models, vLLM, or any OpenAI-compatible endpoint. Performance will depend on the model’s tool-calling capability.
How does Deep Agents handle long-running tasks?
Through three mechanisms: (1) planning tools that break work into trackable steps, (2) subagents that isolate context per subtask, and (3) summarization middleware that compresses old conversation history. Together, these prevent the context window overflow that kills most agents on multi-hour tasks.
Can I deploy Deep Agents in production?
Yes. Since it returns a LangGraph graph, you can deploy with LangSmith, run it in a FastAPI server, or containerize it. The async subagent feature specifically targets production deployments where you need distributed agent orchestration.
What’s the difference between Deep Agents and LangGraph?
LangGraph gives you low-level graph primitives. Deep Agents is an opinionated layer on top — like the difference between Express.js and a full web framework. If you want to control every detail, use LangGraph directly. If you want a working agent fast, use Deep Agents.
Does Deep Agents support MCP?
Yes, through the langchain-mcp-adapters package. You can connect to any MCP server and use its tools within your Deep Agent.
Bottom Line
LangChain Deep Agents is the most complete open-source agent harness available right now. It won’t replace Claude Code for developers who want the best single-tool coding experience — but it fills a gap that nothing else does: a model-agnostic, embeddable, production-ready agent framework with real architectural depth.
The async subagent feature in v0.5 is particularly forward-looking. As agent workloads get more complex, the ability to orchestrate multiple specialized agents across different infrastructure becomes essential. Deep Agents is building for that future now.
If you’re building agent-powered products (not just using agents as tools), Deep Agents deserves a serious look.