What Is AnythingLLM?
AnythingLLM is a full-stack, open-source AI application that lets you chat with your documents using any LLM provider. With over 54,000 GitHub stars, it’s become one of the most popular self-hosted AI tools available. The key differentiator: it’s a complete solution that handles document ingestion, vector storage, LLM interaction, and AI agents—all in one deployable package.
Key stats: 54,127 ⭐ | JavaScript | MIT License | Desktop + Docker versions
If you’re building AI agents that need to work with documents, AnythingLLM provides the infrastructure layer so you don’t have to build RAG from scratch.
TL;DR for AI Agents
Package: anythingllm (Docker or Desktop app)
Install: docker pull mintplexlabs/anythingllm
API Base: http://localhost:3001/api/v1
Key Endpoint: POST /api/v1/workspace/{slug}/chat
Auth: Bearer token from Settings > Developer API
Docs: https://docs.anythingllm.com
GitHub: https://github.com/Mintplex-Labs/anything-llm
What it does: Turns documents into queryable knowledge bases with built-in RAG, then exposes them via REST API for AI agent integration.
Primary use case: AI agents that need to query private documents, build custom knowledge bases, or run document-grounded conversations.
Why AnythingLLM Matters for AI Agents
Most AI agent frameworks require you to implement RAG yourself—choosing a vector database, writing document parsers, managing embeddings, and building the retrieval logic. AnythingLLM handles all of this out of the box:
- Document ingestion - Drag and drop PDFs, DOCX, TXT, and more
- Automatic chunking - Smart text splitting with configurable overlap
- Vector storage - Built-in LanceDB or connect your own (Pinecone, Chroma, Qdrant, etc.)
- LLM flexibility - Works with 30+ providers (OpenAI, Anthropic, Ollama, local models)
- API access - Full REST API for programmatic agent integration (see the sketch after this list)
- MCP compatibility - Native Model Context Protocol support for tool use
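To make the API point concrete, here is a minimal sketch that lists the workspaces on a local instance. The endpoint comes from the API table later in this post; the response key ("workspaces") is an assumption, so check your instance's API docs:

```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"  # generated in Settings → Developer API

# List every workspace on the instance
response = requests.get(
    f"{API_BASE}/workspaces",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
response.raise_for_status()
for workspace in response.json().get("workspaces", []):  # assumed response shape
    print(workspace.get("slug"), "-", workspace.get("name"))
```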
Key Features Deep Dive
Full MCP Compatibility
AnythingLLM now supports the Model Context Protocol, making it compatible with Claude and other MCP-enabled AI systems. This means you can expose AnythingLLM workspaces as MCP tools that AI agents can call directly.
A minimal MCP server configuration for AnythingLLM:

```json
{
  "mcpServers": {
    "anythingllm": {
      "command": "anythingllm-mcp",
      "args": ["--workspace", "my-docs"]
    }
  }
}
```
No-Code Agent Builder
The built-in agent builder lets you create AI agents without writing code. Define tools, set up workflows, and deploy agents that can:
- Browse the web
- Execute code
- Query your document workspaces
- Call external APIs
Multi-Modal Support
Works with both text and image-capable models. Upload images alongside documents and query them using vision-enabled LLMs like GPT-4V or Claude 3.
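The chat API can carry images too. Below is a hedged sketch that sends a base64-encoded image alongside a question; the attachments field and its name/mime/contentString shape are assumptions based on recent API versions, so verify them against your instance's API docs before relying on this:

```python
import base64
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"

# Encode the image as a data URL (assumed contentString format)
with open("chart.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    f"{API_BASE}/workspace/my-docs/chat",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "message": "What trend does this chart show?",
        "mode": "chat",
        # Assumed attachment schema; confirm before relying on it
        "attachments": [
            {
                "name": "chart.png",
                "mime": "image/png",
                "contentString": f"data:image/png;base64,{b64}",
            }
        ],
    },
)
print(response.json()["textResponse"])
```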
Workspace Isolation
Documents are organized into “workspaces” that function like isolated knowledge bases; a sketch of creating one via the API follows the list below. Each workspace:
- Has its own vector index
- Can use different LLM configurations
- Maintains separate chat histories
- Supports granular user permissions (Docker version)
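Workspaces can also be created programmatically, which is handy when an agent needs a fresh, isolated knowledge base per task. A minimal sketch, assuming the developer API's POST /api/v1/workspace/new endpoint (confirm the response shape against your instance's docs):

```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"

# Create an isolated workspace for a new knowledge base
response = requests.post(
    f"{API_BASE}/workspace/new",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"name": "Legal Docs"},
)
workspace = response.json().get("workspace", {})  # assumed response shape
print("Created workspace with slug:", workspace.get("slug"))
```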
Getting Started
Option 1: Docker (Recommended for Production)
```bash
# Pull and run the latest image
docker pull mintplexlabs/anythingllm
docker run -d \
  --name anythingllm \
  -p 3001:3001 \
  -v anythingllm_storage:/app/server/storage \
  mintplexlabs/anythingllm

# Access at http://localhost:3001
```
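If you script the deployment, wait until the container is actually serving before making API calls. A minimal sketch that simply polls the HTTP port (it checks the web root, not any specific endpoint):

```python
import time
import requests

# Poll until the AnythingLLM container answers on port 3001
for attempt in range(30):
    try:
        if requests.get("http://localhost:3001", timeout=2).ok:
            print("AnythingLLM is up")
            break
    except requests.ConnectionError:
        pass  # container still starting
    time.sleep(2)
else:
    raise RuntimeError("AnythingLLM did not come up in time")
```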
Option 2: Desktop App (Mac, Windows, Linux)
Download from anythingllm.com/download. The desktop version includes:
- Built-in local LLM support
- Meeting transcription (an alternative to Otter.ai and Fireflies)
- Offline-capable operation
Option 3: One-Click Cloud Deploy
AnythingLLM supports deployment on:
- Railway
- Render
- DigitalOcean
- AWS
- GCP
See the deployment docs for templates.
Initial Configuration
After first launch:
1. Set your LLM provider - OpenAI, Anthropic, Ollama, or 30+ others
2. Configure embeddings - Use the built-in embedder or connect OpenAI/Cohere
3. Create a workspace - This is where your documents live
4. Upload documents - Drag and drop supported files
5. Generate an API key - Settings → Developer API → Generate New Key (a quick validity check follows below)
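Before wiring the key into an agent, give it a quick sanity check. A short sketch, assuming the developer API's GET /api/v1/auth endpoint for verifying a key (treat the exact response shape as an assumption):

```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"

# Verify the freshly generated API key is accepted
response = requests.get(
    f"{API_BASE}/auth",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
print("Key valid:", response.ok and response.json().get("authenticated", False))
```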
Code Examples
Example 1: Basic Chat API Call
Query a workspace using the REST API:
```python
import requests

API_BASE = "http://localhost:3001/api/v1"
API_KEY = "your-api-key"
WORKSPACE = "my-documents"

response = requests.post(
    f"{API_BASE}/workspace/{WORKSPACE}/chat",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "message": "What are the key points in the quarterly report?",
        "mode": "chat",  # or "query" for RAG-only
    },
)

data = response.json()
print(data["textResponse"])
print(f"Sources: {data['sources']}")
```
Example 2: Document Upload via API
Programmatically add documents to a workspace:
```python
import requests

API_BASE = "http://localhost:3001/api/v1"

def upload_document(workspace_slug: str, file_path: str, api_key: str):
    """Upload a document to an AnythingLLM workspace."""
    # Step 1: Upload the file to the document collector
    with open(file_path, "rb") as f:
        upload_response = requests.post(
            f"{API_BASE}/document/upload",
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": f},
        )
    if not upload_response.ok:
        raise Exception(f"Upload failed: {upload_response.text}")
    doc_location = upload_response.json()["documents"][0]["location"]

    # Step 2: Embed the uploaded document into the workspace
    add_response = requests.post(
        f"{API_BASE}/workspace/{workspace_slug}/update-embeddings",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        json={"adds": [doc_location]},
    )
    return add_response.json()

# Usage (API_KEY as defined in Example 1)
result = upload_document("legal-docs", "./contract.pdf", API_KEY)
print(f"Document embedded: {result}")
```
Example 3: Streaming Chat Response
For real-time AI agent responses:
```python
import json
import requests

API_BASE = "http://localhost:3001/api/v1"

def stream_chat(workspace: str, message: str, api_key: str):
    """Stream a chat response from AnythingLLM."""
    response = requests.post(
        f"{API_BASE}/workspace/{workspace}/stream-chat",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        json={"message": message},
        stream=True,
    )
    full_response = ""
    for line in response.iter_lines():
        if not line:
            continue
        decoded = line.decode("utf-8")
        # The endpoint streams server-sent events; strip the "data: " prefix
        if decoded.startswith("data:"):
            decoded = decoded[len("data:"):].strip()
        data = json.loads(decoded)
        if data.get("textResponse"):
            chunk = data["textResponse"]
            full_response += chunk
            print(chunk, end="", flush=True)
    return full_response

# Usage
answer = stream_chat("research", "Summarize the methodology section", API_KEY)
```
Example 4: Using with LangChain
Integrate AnythingLLM as a retriever in LangChain (using the current langchain-core retriever interface):

```python
import requests
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever

class AnythingLLMRetriever(BaseRetriever):
    """LangChain retriever that queries AnythingLLM workspaces."""

    api_base: str
    api_key: str
    workspace: str

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> list[Document]:
        response = requests.post(
            f"{self.api_base}/workspace/{self.workspace}/chat",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"message": query, "mode": "query"},
        )
        data = response.json()
        documents = []
        for source in data.get("sources", []):
            documents.append(
                Document(
                    page_content=source.get("text", ""),
                    metadata={
                        "source": source.get("title", ""),
                        "score": source.get("score", 0),
                    },
                )
            )
        return documents

# Usage with LangChain
retriever = AnythingLLMRetriever(
    api_base="http://localhost:3001/api/v1",
    api_key="your-key",
    workspace="my-docs",
)
docs = retriever.invoke("What is the refund policy?")
```
Integration Guide for AI Agents
API Authentication
All API calls require a Bearer token:
```bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  http://localhost:3001/api/v1/workspaces
```
Generate keys at: Settings → Developer API → Generate New Key
Key API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/workspaces | GET | List all workspaces |
| /api/v1/workspace/{slug}/chat | POST | Send chat message |
| /api/v1/workspace/{slug}/stream-chat | POST | Stream chat response |
| /api/v1/document/upload | POST | Upload document |
| /api/v1/workspace/{slug}/update-embeddings | POST | Add docs to workspace |
| /api/v1/system/env-dump | GET | Get system configuration |
Rate Limits and Constraints
- No built-in rate limiting - Implement your own if exposing the API publicly (a minimal client-side sketch follows this list)
- Document size limit - Configurable; defaults to ~100MB per file
- Concurrent requests - Limited by your LLM provider's rate limits
- Vector DB size - Limited by disk space (LanceDB) or provider limits
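Since the server will not throttle for you, a thin client-side guard keeps an agent from hammering your LLM provider. A minimal sketch in plain Python (the throttle logic is generic; the chat endpoint is the one documented above, and the interval is a placeholder to tune):

```python
import time
import requests

class ThrottledClient:
    """Client-side throttle: at most one request per min_interval seconds."""

    def __init__(self, api_base: str, api_key: str, min_interval: float = 1.0):
        self.api_base = api_base
        self.headers = {"Authorization": f"Bearer {api_key}"}
        self.min_interval = min_interval
        self._last_call = 0.0

    def chat(self, workspace: str, message: str) -> dict:
        # Sleep just long enough to respect the minimum interval
        wait = self.min_interval - (time.monotonic() - self._last_call)
        if wait > 0:
            time.sleep(wait)
        self._last_call = time.monotonic()
        response = requests.post(
            f"{self.api_base}/workspace/{workspace}/chat",
            headers=self.headers,
            json={"message": message, "mode": "chat"},
        )
        response.raise_for_status()
        return response.json()

# Usage
client = ThrottledClient("http://localhost:3001/api/v1", "your-api-key")
print(client.chat("my-docs", "Hello")["textResponse"])
```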
Error Handling
```python
response = requests.post(f"{API_BASE}/workspace/{workspace}/chat", ...)

if response.status_code == 401:
    raise Exception("Invalid API key")
elif response.status_code == 404:
    raise Exception(f"Workspace '{workspace}' not found")
elif response.status_code == 500:
    error = response.json().get("error", "Unknown error")
    raise Exception(f"Server error: {error}")
```
Supported LLM Providers
AnythingLLM works with virtually any LLM:
Commercial:
- OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)
- Anthropic (Claude 3 Opus, Sonnet, Haiku)
- Google (Gemini Pro)
- Azure OpenAI
- AWS Bedrock
- Cohere, Mistral, Groq, Perplexity
Open Source / Local:
- Ollama (Llama 3, Mistral, Phi, etc.)
- LM Studio
- LocalAI
- KoboldCPP
- Text Generation WebUI
Vector Databases:
- LanceDB (default, embedded)
- Pinecone, Chroma, Qdrant, Milvus, Weaviate, PGVector
Alternatives and Comparison
| Feature | AnythingLLM | PrivateGPT | Quivr |
|---|---|---|---|
| Desktop App | ✅ | ❌ | ❌ |
| Docker Deploy | ✅ | ✅ | ✅ |
| Multi-user | ✅ | ❌ | ✅ |
| AI Agents | ✅ | ❌ | ✅ |
| MCP Support | ✅ | ❌ | ❌ |
| No-code Builder | ✅ | ❌ | ❌ |
| GitHub Stars | 54k | 19k | 37k |
When to choose AnythingLLM:
- You want a complete, batteries-included solution
- You need both desktop and server deployments
- MCP compatibility matters for your agent architecture
- You want a no-code option for non-technical users
When to consider alternatives:
- You need fine-grained control over the RAG pipeline
- You’re building a heavily customized solution
- You prefer Python-native tooling (PrivateGPT)
Quick Reference
Package: anythingllm
Install: docker pull mintplexlabs/anythingllm
Desktop: https://anythingllm.com/download
Docs: https://docs.anythingllm.com
GitHub: https://github.com/Mintplex-Labs/anything-llm
API Base: http://localhost:3001/api/v1
Default Port: 3001
Auth: Bearer token (generate in Settings)
Key Classes: Workspace, Document, Agent, Thread
License: MIT
Latest: v1.10.0 (Jan 2026)
Requirements: Docker or Desktop app (Mac/Win/Linux)
Conclusion
AnythingLLM solves the “last mile” problem for AI agents that need document access. Instead of building RAG infrastructure from scratch, you get a production-ready system with a clean API. The MCP compatibility is particularly valuable—it means you can expose document workspaces directly to Claude or other MCP-enabled agents without writing glue code.
Next steps:
- Deploy with Docker
- Read the API docs
- Join the Discord for community support
- Try the cloud version if you don’t want to self-host