How to Use Vector Databases for AI Applications


Vector databases store numerical representations (embeddings) of your data, enabling semantic search and AI memory. Use them to build RAG systems that let LLMs answer questions about your documents, products, or knowledge base.

Quick Answer

Traditional databases find exact matches. Vector databases find similar content—essential for AI applications where you need “find documents about X” rather than “find documents containing the word X.”
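Under the hood, "similar" means the vectors point in nearly the same direction. A minimal sketch of cosine similarity, the metric most vector databases use by default (the helper name is ours, for illustration):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes:
    # 1.0 = same direction, 0.0 = unrelated, -1.0 = opposite
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions)
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # identical -> 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # unrelated -> 0.0
```

A vector database's job is to run this comparison (or an approximation of it) against millions of stored vectors quickly.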

Step-by-Step Guide

Step 1: Choose a Vector Database

| Database | Best For | Pricing |
|---|---|---|
| Pinecone | Production apps, managed | Free tier, then $70/mo+ |
| Weaviate | Open-source, full-featured | Free (self-hosted) |
| Chroma | Local development, Python | Free |
| Qdrant | High performance, Rust | Free (self-hosted) |
| Milvus | Enterprise scale | Free (self-hosted) |
| pgvector | PostgreSQL users | Free (extension) |

Step 2: Generate Embeddings

Convert your text into vectors using an embedding model:

from openai import OpenAI

client = OpenAI()

def get_embedding(text):
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

# Example
embedding = get_embedding("How to build an AI agent")
# Returns: [0.023, -0.041, 0.067, ...] (1536 dimensions)

Step 3: Store Vectors

Using Chroma (simplest for local development):

import chromadb

# Create client and collection (named chroma_client to avoid
# clashing with the OpenAI client from Step 2)
chroma_client = chromadb.Client()
collection = chroma_client.create_collection("my_docs")

# Add documents (Chroma generates embeddings automatically)
collection.add(
    documents=["AI agents automate tasks", "RAG improves LLM accuracy"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
    ids=["id1", "id2"]
)

Using Pinecone (production):

from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
index = pc.Index("my-index")

# Upsert vectors
index.upsert(
    vectors=[
        {"id": "doc1", "values": embedding, "metadata": {"text": "..."}}
    ]
)

Step 4: Query for Similar Content

# Search for similar documents
results = collection.query(
    query_texts=["How do AI agents work?"],
    n_results=3
)

# Returns most semantically similar documents
print(results['documents'])
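Conceptually, this query step is a nearest-neighbor search: embed the query, score it against every stored vector, and return the top matches. A brute-force sketch of that idea (function and variable names are ours; real databases use approximate indexes such as HNSW to avoid scanning everything):

```python
import math

def top_k(query_vec, stored, k=3):
    """stored: list of (doc_id, vector) pairs; returns ids ranked by cosine similarity."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
    # Sort all stored vectors by similarity to the query, best first
    scored = sorted(stored, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

docs = [("doc1", [0.9, 0.1]), ("doc2", [0.1, 0.9]), ("doc3", [0.8, 0.3])]
print(top_k([1.0, 0.0], docs, k=2))  # -> ['doc1', 'doc3']
```

This is exactly what `collection.query` does for you, plus indexing so it stays fast at scale.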

Step 5: Build RAG Pipeline

Combine vector search with LLM generation:

def rag_answer(question):
    # 1. Find relevant context
    results = collection.query(
        query_texts=[question],
        n_results=3
    )
    context = "\n".join(results['documents'][0])
    
    # 2. Generate answer with context (client is the OpenAI client from Step 2)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": question}
        ]
    )
    return response.choices[0].message.content

Common Use Cases

| Use Case | How It Works |
|---|---|
| Document Q&A | Embed docs → Query → LLM answers |
| Semantic search | Find similar content by meaning |
| Recommendations | Find similar products/content |
| Chatbot memory | Store/retrieve conversation history |
| Code search | Find similar code snippets |

Embedding Model Options

| Model | Dimensions | Cost | Quality |
|---|---|---|---|
| OpenAI text-embedding-3-small | 1536 | $0.02/1M tokens | Good |
| OpenAI text-embedding-3-large | 3072 | $0.13/1M tokens | Better |
| Cohere embed-v3 | 1024 | $0.10/1M tokens | Excellent |
| Voyage-3 | 1024 | $0.06/1M tokens | Excellent |
| all-MiniLM-L6-v2 (local) | 384 | Free | Good |
| BGE-large (local) | 1024 | Free | Very good |
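The per-token prices above make embedding costs easy to estimate up front. A quick sketch using the table's listed rate for text-embedding-3-small (the helper is ours, and prices change, so treat the numbers as illustrative):

```python
def embedding_cost(total_tokens, price_per_million):
    # Providers bill per million input tokens
    return total_tokens / 1_000_000 * price_per_million

# Example: 5,000 documents averaging 2,000 tokens each = 10M tokens
tokens = 5_000 * 2_000
print(f"${embedding_cost(tokens, 0.02):.2f}")  # text-embedding-3-small -> $0.20
```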

Best Practices

Chunking Strategy

  • Chunk size: 256-512 tokens is a good starting point for most corpora
  • Overlap: repeat 10-20% of each chunk at the start of the next so context spans boundaries
  • Preserve context: split on sentence or paragraph boundaries, not mid-sentence
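The overlap strategy above can be sketched in a few lines, using words as a stand-in for tokens (a real pipeline would count tokens with a tokenizer such as tiktoken; the function name is ours):

```python
def chunk_text(text, chunk_size=400, overlap=60):
    """Split text into word-based chunks, repeating `overlap` words between neighbors."""
    words = text.split()
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk reached the end of the text
    return chunks

demo = chunk_text(" ".join(f"w{i}" for i in range(1000)), chunk_size=400, overlap=60)
print(len(demo))  # 3 chunks: words 0-399, 340-739, 680-999
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk.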

Metadata

Always store useful metadata:

{
    "source": "document_name.pdf",
    "page": 5,
    "section": "Introduction",
    "date": "2026-03-06"
}

Combine vector search with metadata filtering:

results = collection.query(
    query_texts=["AI agents"],
    where={"source": "technical_docs"},  # Filter
    n_results=5
)

Scaling Considerations

| Scale | Recommended |
|---|---|
| < 10K vectors | Chroma (local), pgvector |
| 10K-1M vectors | Pinecone, Qdrant Cloud |
| 1M-100M vectors | Weaviate, Milvus |
| > 100M vectors | Custom infrastructure |
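A rough way to place yourself in the table above is raw vector memory: dimensions × 4 bytes per float32 value, times the number of vectors (indexes such as HNSW add overhead on top; the helper name is ours):

```python
def vector_memory_gb(n_vectors, dims, bytes_per_value=4):
    # float32 embeddings take 4 bytes per dimension
    return n_vectors * dims * bytes_per_value / 1024**3

# 1M vectors of 1536-dim float32 embeddings
print(round(vector_memory_gb(1_000_000, 1536), 2))  # -> 5.72 (GB)
```

If the result fits comfortably in RAM on one machine, a local option like Chroma or pgvector is usually enough.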

Last verified: 2026-03-06