What Is AnythingLLM?
AnythingLLM is a full-stack, open-source AI application that lets you chat with your documents using any LLM provider. With over 54,000 GitHub stars, it’s become one of the most popular self-hosted AI tools available. The key differentiator: it’s a complete solution that handles document ingestion, vector storage, LLM interaction, and AI agents—all in one deployable package.
Key stats: 54,127 ⭐ | JavaScript | MIT License | Desktop + Docker versions
If you’re building AI agents that need to work with documents, AnythingLLM provides the infrastructure layer so you don’t have to build RAG from scratch.
TL;DR for AI Agents
Package: anythingllm (Docker or Desktop app)
Install: docker pull mintplexlabs/anythingllm
API Base: http://localhost:3001/api/v1
Key Endpoint: POST /api/v1/workspace/{slug}/chat
Auth: Bearer token from Settings > Developer API
Docs: https://docs.anythingllm.com
GitHub: https://github.com/Mintplex-Labs/anything-llm
What it does: Turns documents into queryable knowledge bases with built-in RAG, then exposes them via REST API for AI agent integration.
Primary use case: AI agents that need to query private documents, build custom knowledge bases, or run document-grounded conversations.
Why AnythingLLM Matters for AI Agents
Most AI agent frameworks require you to implement RAG yourself—choosing a vector database, writing document parsers, managing embeddings, and building the retrieval logic. AnythingLLM handles all of this out of the box:
- Document ingestion - Drag and drop PDFs, DOCX, TXT, and more
- Automatic chunking - Smart text splitting with configurable overlap
- Vector storage - Built-in LanceDB or connect your own (Pinecone, Chroma, Qdrant, etc.)
- LLM flexibility - Works with 30+ providers (OpenAI, Anthropic, Ollama, local models)
- API access - Full REST API for programmatic agent integration (see the sketch after this list)
- MCP compatibility - Native Model Context Protocol support for tool use
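To make the API point concrete, here is a minimal sketch that lists the workspaces on a local instance. The endpoint comes from the API table later in this post; the response key ("workspaces") is an assumption, so check your instance's API docs:

```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"  # generated in Settings → Developer API

# List every workspace on the instance
response = requests.get(
    f"{API_BASE}/workspaces",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
response.raise_for_status()
for workspace in response.json().get("workspaces", []):  # assumed response shape
    print(workspace.get("slug"), "-", workspace.get("name"))
```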
Key Features Deep Dive
Full MCP Compatibility
AnythingLLM now supports the Model Context Protocol, making it compatible with Claude and other MCP-enabled AI systems. This means you can expose AnythingLLM workspaces as MCP tools that AI agents can call directly.
A minimal MCP server configuration for AnythingLLM:

```json
{
  "mcpServers": {
    "anythingllm": {
      "command": "anythingllm-mcp",
      "args": ["--workspace", "my-docs"]
    }
  }
}
```
No-Code Agent Builder
The built-in agent builder lets you create AI agents without writing code. Define tools, set up workflows, and deploy agents that can:
- Browse the web
- Execute code
- Query your document workspaces
- Call external APIs
Multi-Modal Support
Works with both text and image-capable models. Upload images alongside documents and query them using vision-enabled LLMs like GPT-4V or Claude 3.
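The chat API can carry images too. Below is a hedged sketch that sends a base64-encoded image alongside a question; the attachments field and its name/mime/contentString shape are assumptions based on recent API versions, so verify them against your instance's API docs before relying on this:

```python
import base64
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"

# Encode the image as a data URL (assumed contentString format)
with open("chart.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    f"{API_BASE}/workspace/my-docs/chat",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "What trend does this chart show?",
        "mode": "chat",
        # Assumed attachment schema; confirm before relying on it
        "attachments": [
            {
                "name": "chart.png",
                "mime": "image/png",
                "contentString": f"data:image/png;base64,{b64}",
            }
        ],
    },
)
print(response.json()["textResponse"])
```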
Workspace Isolation
Documents are organized into “workspaces” that function like isolated knowledge bases; a sketch of creating one via the API follows the list below. Each workspace:
- Has its own vector index
- Can use different LLM configurations
- Maintains separate chat histories
- Supports granular user permissions (Docker version)
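Workspaces can also be created programmatically, which is handy when an agent needs a fresh, isolated knowledge base per task. A minimal sketch, assuming the developer API's POST /api/v1/workspace/new endpoint (confirm the response shape against your instance's docs):

```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"

# Create an isolated workspace for a new knowledge base
response = requests.post(
    f"{API_BASE}/workspace/new",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"name": "Legal Docs"},
)
workspace = response.json().get("workspace", {})  # assumed response shape
print("Created workspace with slug:", workspace.get("slug"))
```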
Getting Started
Option 1: Docker (Recommended for Production)
```bash
# Pull and run the latest image
docker pull mintplexlabs/anythingllm
docker run -d \
  --name anythingllm \
  -p 3001:3001 \
  -v anythingllm_storage:/app/server/storage \
  mintplexlabs/anythingllm

# Access at http://localhost:3001
```
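If you script the deployment, wait until the container is actually serving before making API calls. A minimal sketch that simply polls the HTTP port (it checks the web root, not any specific endpoint):

```python
import time
import requests

# Poll until the AnythingLLM container answers on port 3001
for attempt in range(30):
    try:
        if requests.get("http://localhost:3001", timeout=2).ok:
            print("AnythingLLM is up")
            break
    except requests.ConnectionError:
        pass  # container still starting
    time.sleep(2)
else:
    raise RuntimeError("AnythingLLM did not come up in time")
```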
Option 2: Desktop App (Mac, Windows, Linux)
Download from anythingllm.com/download. The desktop version includes:
- Built-in local LLM support
- Meeting transcription (an alternative to Otter.ai and Fireflies)
- Offline-capable operation
Option 3: One-Click Cloud Deploy
AnythingLLM supports deployment on:
- Railway
- Render
- DigitalOcean
- AWS
- GCP
See the deployment docs for templates.
Initial Configuration
After first launch:
1. Set your LLM provider - OpenAI, Anthropic, Ollama, or 30+ others
2. Configure embeddings - Use the built-in embedder or connect OpenAI/Cohere
3. Create a workspace - This is where your documents live
4. Upload documents - Drag and drop supported files
5. Generate an API key - Settings → Developer API → Generate New Key (a quick validity check follows below)
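Before wiring the key into an agent, give it a quick sanity check. A short sketch, assuming the developer API's GET /api/v1/auth endpoint for verifying a key (treat the exact response shape as an assumption):

```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"

# Verify the freshly generated API key is accepted
response = requests.get(
    f"{API_BASE}/auth",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
print("Key valid:", response.ok and response.json().get("authenticated", False))
```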
Code Examples
Example 1: Basic Chat API Call
Query a workspace using the REST API:
```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"
WORKSPACE = "my-documents"

response = requests.post(
    f"{API_BASE}/workspace/{WORKSPACE}/chat",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "message": "What are the key points in the quarterly report?",
        "mode": "chat",  # or "query" for RAG-only
    },
)

data = response.json()
print(data["textResponse"])
print(f"Sources: {data['sources']}")
```
Example 2: Document Upload via API
Programmatically add documents to a workspace:
```python
import requests

API_BASE = "http://localhost:3001/api/v1"

def upload_document(workspace_slug: str, file_path: str, api_key: str):
    """Upload a document to an AnythingLLM workspace."""
    # Step 1: Upload the file to the document collector
    with open(file_path, "rb") as f:
        upload_response = requests.post(
            f"{API_BASE}/document/upload",
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": f},
        )
    if not upload_response.ok:
        raise Exception(f"Upload failed: {upload_response.text}")
    doc_location = upload_response.json()["documents"][0]["location"]

    # Step 2: Embed the uploaded document into the workspace
    add_response = requests.post(
        f"{API_BASE}/workspace/{workspace_slug}/update-embeddings",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        json={"adds": [doc_location]},
    )
    return add_response.json()

# Usage (API_KEY as defined in Example 1)
result = upload_document("legal-docs", "./contract.pdf", API_KEY)
print(f"Document embedded: {result}")
```
Example 3: Streaming Chat Response
For real-time AI agent responses:
```python
import json
import requests

API_BASE = "http://localhost:3001/api/v1"

def stream_chat(workspace: str, message: str, api_key: str):
    """Stream a chat response from AnythingLLM."""
    response = requests.post(
        f"{API_BASE}/workspace/{workspace}/stream-chat",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        json={"message": message},
        stream=True,
    )
    full_response = ""
    for line in response.iter_lines():
        if not line:
            continue
        decoded = line.decode("utf-8")
        # The endpoint streams server-sent events; strip the "data: " prefix
        if decoded.startswith("data:"):
            decoded = decoded[len("data:"):].strip()
        data = json.loads(decoded)
        if data.get("textResponse"):
            chunk = data["textResponse"]
            full_response += chunk
            print(chunk, end="", flush=True)
    return full_response

# Usage
answer = stream_chat("research", "Summarize the methodology section", API_KEY)
```
Example 4: Using with LangChain
Integrate AnythingLLM as a retriever in LangChain (using the current langchain-core retriever interface):

```python
import requests
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class AnythingLLMRetriever(BaseRetriever):
    """LangChain retriever that queries AnythingLLM workspaces."""

    api_base: str
    api_key: str
    workspace: str

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> list[Document]:
        response = requests.post(
            f"{self.api_base}/workspace/{self.workspace}/chat",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"message": query, "mode": "query"},
        )
        data = response.json()
        documents = []
        for source in data.get("sources", []):
            documents.append(
                Document(
                    page_content=source.get("text", ""),
                    metadata={
                        "source": source.get("title", ""),
                        "score": source.get("score", 0),
                    },
                )
            )
        return documents

# Usage with LangChain
retriever = AnythingLLMRetriever(
    api_base="http://localhost:3001/api/v1",
    api_key="your-key",
    workspace="my-docs",
)
docs = retriever.invoke("What is the refund policy?")
```
Integration Guide for AI Agents
API Authentication
All API calls require a Bearer token:
```bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  http://localhost:3001/api/v1/workspaces
```
Generate keys at: Settings → Developer API → Generate New Key
Key API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/workspaces | GET | List all workspaces |
| /api/v1/workspace/{slug}/chat | POST | Send chat message |
| /api/v1/workspace/{slug}/stream-chat | POST | Stream chat response |
| /api/v1/document/upload | POST | Upload document |
| /api/v1/workspace/{slug}/update-embeddings | POST | Add docs to workspace |
| /api/v1/system/env-dump | GET | Get system configuration |
Rate Limits and Constraints
- No built-in rate limiting - Implement your own if exposing the API publicly (a minimal client-side sketch follows this list)
- Document size limit - Configurable; defaults to ~100MB per file
- Concurrent requests - Limited by your LLM provider's rate limits
- Vector DB size - Limited by disk space (LanceDB) or provider limits
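Since the server will not throttle for you, a thin client-side guard keeps an agent from hammering your LLM provider. A minimal sketch in plain Python (the throttle logic is generic; the chat endpoint is the one documented above, and the interval is a placeholder to tune):

```python
import time
import requests

class ThrottledClient:
    """Client-side throttle: at most one request per min_interval seconds."""

    def __init__(self, api_base: str, api_key: str, min_interval: float = 1.0):
        self.api_base = api_base
        self.headers = {"Authorization": f"Bearer {api_key}"}
        self.min_interval = min_interval
        self._last_call = 0.0

    def chat(self, workspace: str, message: str) -> dict:
        # Sleep just long enough to respect the minimum interval
        wait = self.min_interval - (time.monotonic() - self._last_call)
        if wait > 0:
            time.sleep(wait)
        self._last_call = time.monotonic()
        response = requests.post(
            f"{self.api_base}/workspace/{workspace}/chat",
            headers=self.headers,
            json={"message": message, "mode": "chat"},
        )
        response.raise_for_status()
        return response.json()

# Usage
client = ThrottledClient("http://localhost:3001/api/v1", "your-api-key")
print(client.chat("my-docs", "Hello")["textResponse"])
```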
Error Handling
```python
response = requests.post(f"{API_BASE}/workspace/{workspace}/chat", ...)

if response.status_code == 401:
    raise Exception("Invalid API key")
elif response.status_code == 404:
    raise Exception(f"Workspace '{workspace}' not found")
elif response.status_code == 500:
    error = response.json().get("error", "Unknown error")
    raise Exception(f"Server error: {error}")
```
Supported LLM Providers
AnythingLLM works with virtually any LLM:
Commercial:
- OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)
- Anthropic (Claude 3 Opus, Sonnet, Haiku)
- Google (Gemini Pro)
- Azure OpenAI
- AWS Bedrock
- Cohere, Mistral, Groq, Perplexity
Open Source / Local:
- Ollama (Llama 3, Mistral, Phi, etc.)
- LM Studio
- LocalAI
- KoboldCPP
- Text Generation WebUI
Vector Databases:
- LanceDB (default, embedded)
- Pinecone, Chroma, Qdrant, Milvus, Weaviate, PGVector
Alternatives and Comparison
| Feature | AnythingLLM | PrivateGPT | Quivr |
|---|---|---|---|
| Desktop App | ✅ | ❌ | ❌ |
| Docker Deploy | ✅ | ✅ | ✅ |
| Multi-user | ✅ | ❌ | ✅ |
| AI Agents | ✅ | ❌ | ✅ |
| MCP Support | ✅ | ❌ | ❌ |
| No-code Builder | ✅ | ❌ | ❌ |
| GitHub Stars | 54k | 19k | 37k |
When to choose AnythingLLM:
- You want a complete, batteries-included solution
- You need both desktop and server deployments
- MCP compatibility matters for your agent architecture
- You want a no-code option for non-technical users
When to consider alternatives:
- You need fine-grained control over the RAG pipeline
- You’re building a heavily customized solution
- You prefer Python-native tooling (PrivateGPT)
Quick Reference
Package: anythingllm
Install: docker pull mintplexlabs/anythingllm
Desktop: https://anythingllm.com/download
Docs: https://docs.anythingllm.com
GitHub: https://github.com/Mintplex-Labs/anything-llm
API Base: http://localhost:3001/api/v1
Default Port: 3001
Auth: Bearer token (generate in Settings)
Key Classes: Workspace, Document, Agent, Thread
License: MIT
Latest: v1.10.0 (Jan 2026)
Requirements: Docker or Desktop app (Mac/Win/Linux)
Conclusion
AnythingLLM solves the “last mile” problem for AI agents that need document access. Instead of building RAG infrastructure from scratch, you get a production-ready system with a clean API. The MCP compatibility is particularly valuable—it means you can expose document workspaces directly to Claude or other MCP-enabled agents without writing glue code.
Next steps:
- Deploy with Docker
- Read the API docs
- Join the Discord for community support
- Try the cloud version if you don’t want to self-host