TL;DR
DeepTutor is an open-source, agent-native personalized learning assistant from the HKU Data Intelligence Lab. After months of rewrites it just hit v1.0.0 (April 4, 2026) with a brand-new ground-up architecture. Highlights:
- Agent-native from the ground up — ~200k lines rewritten around a two-layer plugin model (Tools + Capabilities) with CLI, SDK, and web entry points
- Personal TutorBots — autonomous tutors with their own memory, personality, skills, and reminders (not stateless chatbots)
- Five modes, one thread — Chat, Deep Solve, Quiz Generation, Deep Research, and Math Animator all share the same context
- Knowledge Hubs — upload PDFs, Markdown, and notes; build RAG-ready color-coded notebooks that power every conversation
- AI Co-Writer + Guided Learning — Markdown editor with AI as first-class collaborator, plus structured multi-step study plans with interactive pages
- Persistent memory — builds a living profile of what you’ve studied and how you learn
- 16,000+ GitHub stars, Apache-2.0, ~4,700 stars added in the last week, hit 10K stars in just 39 days after release
Install in one command: python scripts/start_tour.py — the guided tour handles dependencies, provider setup, and connection testing without touching a .env file.
What Is DeepTutor?
DeepTutor is the education-focused entry in the broader “agent-native” wave. It comes out of HKUDS — the Data Intelligence Lab at the University of Hong Kong — the same group behind LightRAG, GraphGPT, RAG-Anything, and Vibe-Trading. The project was released on December 29, 2025, hit 10K stars in 39 days, and on April 4, 2026 shipped a major v1.0.0 rewrite that repositions it as an agent-native platform rather than a traditional RAG chatbot.
The pitch is simple: instead of yet another “ChatPDF” clone, DeepTutor treats your study materials, your conversations, and your personal AI tutors as first-class primitives in a unified workspace. You upload a textbook, spin up a TutorBot, ask a question in Chat mode, escalate to Deep Solve if it gets hard, generate a quiz from the same thread, then kick off Deep Research — all without losing context.
If you’ve been waiting for an open-source alternative to the closed education assistants (Khanmigo, NotebookLM, Heygen study tools), DeepTutor is the most complete open option right now.
The Five Modes (Shared Context)
DeepTutor’s core insight is that real studying isn’t a single-turn Q&A loop. You bounce between reading, solving, quizzing yourself, and researching. v1.0 bakes that into a unified chat workspace:
- Chat — regular conversation grounded in your knowledge hub
- Deep Solve — multi-agent step-by-step problem solving with precise citations (dual-loop reasoning architecture)
- Quiz Generation — custom quizzes from your notes, or mimic real exam styles
- Deep Research — systematic topic exploration with web search, paper retrieval, and literature synthesis
- Math Animator — renders animated math explanations (Chart.js/SVG pipeline added in v1.0.1)
All five share the same thread. Escalating from a casual question into a full Deep Research run takes one click — no copy-pasting between tools, no losing your place.
Personal TutorBots
This is the feature that makes DeepTutor more than “a better RAG chatbot.” Each TutorBot is an autonomous tutor with:
- Its own workspace, memory, and personality
- A skill set you can extend by adding SKILL.md files
- A built-in Heartbeat system for recurring check-ins, review reminders, and scheduled tasks
- Full tool access — RAG retrieval, code execution, web search, academic paper search, deep reasoning
- Powered by nanobot, HKUDS’s lightweight agent runtime
Practically: you can create a “Linear Algebra Tutor” that remembers what you’ve covered, pings you every Tuesday to review eigenvalues, and gets smarter as you feed it new materials. It’s closer in spirit to OpenClaw subagents or Letta memory agents than to a ChatGPT custom GPT.
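To make the skill-extension idea concrete, here is a sketch of what a SKILL.md for that Socratic-style tutor might contain. The file name matches the `--skill ./skills/socratic.md` flag shown in the CLI section below, but the internal structure (headings, sections) is my own assumption — check the project docs for the actual schema.

```markdown
<!-- skills/socratic.md — hypothetical sketch; the real SKILL.md schema may differ -->
# Socratic Questioning

## When to use
When the student asks for a direct answer to a conceptual question.

## Behavior
- Never state the final answer immediately.
- Ask one guiding question at a time, grounded in the attached knowledge hub.
- After three exchanges, summarize what the student derived themselves.
```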
Installation — The Guided Tour
One of the nicest developer-experience decisions in v1.0 is the interactive installer. Instead of 30 steps of pip install, environment-variable wrangling, and frontend bundling, you run a single script:
git clone https://github.com/HKUDS/DeepTutor.git
cd DeepTutor
# Create a Python environment (3.11+ required as of v1.0.0-beta.2)
conda create -n deeptutor python=3.11 && conda activate deeptutor
# Or: python -m venv .venv && source .venv/bin/activate
# Launch the guided tour
python scripts/start_tour.py
The tour asks how you want to run it:
- Web mode — installs both pip and npm dependencies, spins up a temporary server, opens a Settings page in your browser, and walks you through LLM, embedding, and search provider setup with live connection testing. When you finish, DeepTutor restarts with your config applied.
- CLI mode — a fully interactive terminal flow for headless servers.
Either way you end up with DeepTutor running at http://localhost:3782.
Manual install (if you want full control)
git clone https://github.com/HKUDS/DeepTutor.git
cd DeepTutor
conda create -n deeptutor python=3.11 && conda activate deeptutor
pip install -e ".[server]"
cd web && npm install && cd ..
cp .env.example .env
Then edit .env with at minimum an LLM and an embedding provider:
# LLM (required)
LLM_BINDING=openai
LLM_MODEL=gpt-4o-mini
LLM_API_KEY=sk-xxx
LLM_HOST=https://api.openai.com/v1
# Embedding (required for Knowledge Base / RAG)
EMBEDDING_BINDING=openai
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_API_KEY=sk-xxx
EMBEDDING_HOST=https://api.openai.com/v1
EMBEDDING_DIMENSION=3072
DeepTutor supports an unusually wide range of providers out of the box: OpenAI, Anthropic, Azure OpenAI, AiHubMix, BytePlus, Ollama, vLLM, SiliconFlow, DeepSeek, and more. If you want fully local, pair Ollama (LLM) with a local embedding model — RAG still works end-to-end without any external API calls.
Docker Deployment
For servers, pre-built images ship on GHCR (added in v0.3.0 and kept current through v1.0.x):
docker pull ghcr.io/hkuds/deeptutor:latest
docker run -d \
--name deeptutor \
-p 3782:3782 \
-v $(pwd)/data:/app/data \
-e LLM_BINDING=openai \
-e LLM_API_KEY=sk-xxx \
ghcr.io/hkuds/deeptutor:latest
Point it at a persistent volume (/app/data) so your Knowledge Hubs, TutorBots, and memory survive container restarts.
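If you prefer Compose over a raw docker run, the command above translates roughly to the file below. The image, port, volume, and environment variables come straight from the run command; the restart policy is my own addition.

```yaml
# docker-compose.yml -- sketch equivalent of the docker run command above
services:
  deeptutor:
    image: ghcr.io/hkuds/deeptutor:latest
    container_name: deeptutor
    ports:
      - "3782:3782"
    volumes:
      - ./data:/app/data   # persistent Knowledge Hubs, TutorBots, memory
    environment:
      LLM_BINDING: openai
      LLM_API_KEY: sk-xxx  # replace with your real key
    restart: unless-stopped
```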
Agent-Native CLI
The v1.0 rewrite added a first-class CLI where every capability is one command away. The output is rich for humans and structured JSON for agents:
# Create a knowledge base and ingest files
deeptutor kb create "Linear Algebra" --notes blue
deeptutor kb add "Linear Algebra" ./textbook.pdf ./notes.md
# Create a TutorBot
deeptutor bot create "Linear Algebra Tutor" \
--kb "Linear Algebra" \
--skill ./skills/socratic.md
# Talk to it from the shell
deeptutor chat "Explain the rank-nullity theorem with an example" \
--bot "Linear Algebra Tutor" --mode deep-solve
Because the CLI speaks JSON when asked, you can drop DeepTutor into an OpenClaw, Claude Code, or Codex pipeline as a tool. Hand it a SKILL.md and your existing agents can operate DeepTutor autonomously — scheduling quizzes, running research sweeps, and piping results back into your own workflow.
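As a sketch of what consuming that JSON from your own agent glue code could look like: the payload shape below is invented for illustration (the real schema may use different field names), but the pattern — shell out to the CLI, parse the JSON, extract what you need — is the same regardless.

```python
# Sketch: parsing hypothetical JSON output from the DeepTutor CLI.
# The field names here are illustrative assumptions, not documented API;
# check the actual CLI output schema before relying on them.
import json

SAMPLE = """
{
  "mode": "deep-solve",
  "answer": "rank(A) + nullity(A) = n",
  "citations": [{"source": "textbook.pdf", "page": 112}]
}
"""

def extract_citations(raw: str) -> list[str]:
    """Turn a response payload into human-readable citation strings."""
    payload = json.loads(raw)
    return [f'{c["source"]} p.{c["page"]}' for c in payload.get("citations", [])]
```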
What’s New in v1.0 (and Why It Matters)
The release notes from the last week read like a small product team shipping daily, which is a good sign for an open-source project you’re evaluating:
- v1.0.0 (2026-04-04) — agent-native architecture rewrite (~200k lines), two-layer plugin model (Tools + Capabilities), TutorBot, Co-Writer, Guided Learning, persistent memory
- v1.0.0-beta.3 (2026-04-08) — dropped the litellm dependency in favor of native OpenAI/Anthropic SDK providers, robust JSON parsing for LLM outputs, full Chinese i18n
- v1.0.1 (2026-04-10) — Visualize capability with Chart.js/SVG rendering, quiz duplicate prevention, o4-mini support, server logging improvements
- v1.0.2 (2026-04-11) — search consolidation with SearXNG fallback, provider-switch bug fix, explicit runtime config in test runner, frontend resource-leak fixes
The move to native provider SDKs is the one I’d call out: litellm is convenient but a common source of tracebacks with less-common providers. Going native is a maturity signal.
Community Reaction
DeepTutor has been quietly racking up activity across the usual channels:
- 10,000 stars in 39 days — a rate normally reserved for launches backed by big-name companies
- ~4,700 stars this week according to GitHub Trending at the time of writing, keeping it near the top of the weekly Python charts
- Active Discord + WeChat communities plus a GitHub Discussions board; maintainers respond within hours on issues
- Praise on r/LocalLLaMA for being one of the few “agent-native” projects that isn’t just a thin wrapper on LangChain
- Criticism on Hacker News that the feature set is too ambitious — “five modes is three modes too many for v1” was a common take
My honest read: the breadth of features is a risk if you want a stable, narrow tool. But as an educational workspace, it’s easily the most complete open-source option I’ve tested this year.
Honest Limitations
DeepTutor is ambitious, and ambition has costs. After setting it up locally, here are the warts worth knowing:
- Python 3.11+ only as of v1.0.0-beta.2 — if you’re on 3.10, you’ll need a new environment
- Node + Python split — running the web UI means managing both pip and npm dependencies, which is more moving parts than a pure-Python project
- RAM hungry on big knowledge bases — the unified memory + RAG pipeline can use 4-6 GB when you push several hundred PDFs through it
- Math Animator is hit-or-miss — great on clean algebraic content, shaky on dense proofs; the v1.0.1 Chart.js/SVG pipeline is a clear step up from the pre-1.0 renderer but still early
- Docs lag the release pace — they shipped four versions in the last week, and some CLI flags in the README still reference pre-1.0 behavior
- No hosted cloud option — you self-host or nothing; fine for developers, a barrier for non-technical students
- Apache-2.0, not AGPL — good news for embedding it in proprietary tools, but it means commercial forks are fair game
None of these are dealbreakers, but go in expecting a v1.0, not a v3.
DeepTutor vs NotebookLM vs AnythingLLM
Where does it sit in the landscape?
| Feature | DeepTutor | NotebookLM | AnythingLLM |
|---|---|---|---|
| Open source | ✅ Apache-2.0 | ❌ Closed | ✅ MIT |
| Self-hosted | ✅ | ❌ | ✅ |
| Persistent agents (bots) | ✅ TutorBots | ❌ | Partial |
| Multi-mode chat (quiz/solve/research) | ✅ 5 modes | Partial | ❌ |
| Knowledge base / RAG | ✅ | ✅ | ✅ |
| Math animation | ✅ | ❌ | ❌ |
| Guided Learning paths | ✅ | ❌ | ❌ |
| CLI / SDK for agent use | ✅ | ❌ | Partial |
| Ollama / local LLMs | ✅ | ❌ | ✅ |
If your priority is a polished, free, hosted study tool, NotebookLM still wins on UX. If you want a generic self-hosted RAG workspace for a company, AnythingLLM is the safer choice. But if you want a self-hosted, multi-mode learning workspace with persistent tutor agents — DeepTutor is the only realistic option today.
Who Should Use DeepTutor?
- Grad students and self-learners drowning in PDFs who want a study buddy that actually remembers them
- Teachers and bootcamps building personalized tutor experiences without paying per-seat SaaS fees
- Agent developers who want an education-shaped RAG+memory workspace they can plug into OpenClaw, Claude Code, or Codex pipelines
- Privacy-conscious learners who want a fully local pipeline with Ollama + local embeddings
Who should not use it yet: non-technical students, people who need a stable v3 product today, and teams that can’t afford the 4-6 GB RAM overhead on large knowledge bases.
FAQ
Is DeepTutor free? Yes. It’s Apache-2.0 licensed and fully open source. You only pay for the LLM and embedding providers you configure — and you can skip that entirely by pointing it at a local Ollama + local embedding model.
Can I run DeepTutor fully offline?
Yes. Configure LLM_BINDING=ollama and a local embedding model (or use a self-hosted embedding server). Web search is optional and can be disabled, and v1.0.2 added a SearXNG fallback if you want a privacy-respecting self-hosted search provider.
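A minimal fully-local .env might look like this. LLM_BINDING=ollama comes from the answer above; the embedding binding value, model names, hosts, and dimension are illustrative assumptions, so substitute whatever your local setup actually serves.

```shell
# Hypothetical fully-offline .env sketch. LLM_BINDING=ollama is documented;
# the remaining values are illustrative assumptions.
LLM_BINDING=ollama
LLM_MODEL=llama3.1:8b
LLM_HOST=http://localhost:11434

EMBEDDING_BINDING=ollama          # assumption: mirrors the LLM binding name
EMBEDDING_MODEL=nomic-embed-text  # any embedding model your Ollama serves
EMBEDDING_HOST=http://localhost:11434
EMBEDDING_DIMENSION=768           # must match the embedding model you pick
```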
What’s the difference between DeepTutor and NotebookLM? DeepTutor is open source, self-hosted, multi-mode (Chat, Deep Solve, Quiz, Deep Research, Math Animator), and has persistent TutorBots with their own memory. NotebookLM is a closed Google product with a nicer UI but no agent framework, no CLI, no offline mode, and no custom skills.
Does DeepTutor work with Claude or Anthropic models?
Yes. As of v1.0.0-beta.3, Anthropic is a first-class native provider (no litellm in between). Set LLM_BINDING=anthropic and your LLM_MODEL to any supported Claude model.
How is DeepTutor different from LangChain or LlamaIndex? LangChain and LlamaIndex are libraries; DeepTutor is an application. It ships a web UI, a CLI, TutorBots, memory, a quiz engine, and a math animator — you use it, you don’t assemble it from primitives. Under the hood it still leverages HKUDS’s own RAG stack (LightRAG, RAG-Anything).
Can I integrate DeepTutor with OpenClaw or Claude Code?
Yes. The agent-native CLI outputs structured JSON on demand and accepts SKILL.md files, so any agent that can run shell commands can drive DeepTutor. That makes it a natural fit for OpenClaw subagents and Claude Code pipelines.
Verdict
DeepTutor v1.0 is the most complete open-source learning assistant I’ve seen. The agent-native rewrite gives it a serious architectural edge over NotebookLM-clones, the TutorBots concept is genuinely novel, and the guided installer is best-in-class. It’s a little raw in v1.0 — Math Animator needs work, docs are lagging, and the feature set is wide — but the release cadence (four versions in a week) and community traction (16K stars, 10K in 39 days) suggest this will age well.
If you self-host any AI tooling at all, DeepTutor is worth a weekend.
Star the repo, clone it, run python scripts/start_tour.py, and spin up your first TutorBot. That’s the fastest way to see whether it fits how you learn.
Links:
- GitHub: HKUDS/DeepTutor
- Website: hkuds.github.io/DeepTutor
- Discord: discord.gg/eRsjPgMU4t
- License: Apache-2.0