TL;DR

Vector databases power semantic search, chatbot memory, recommendations, and duplicate detection. This guide shows real implementations: searching documents by meaning, building chatbots that remember context, finding similar products, and detecting duplicate content—with code examples you can adapt.

Why examples matter

Understanding vector databases conceptually is one thing. Seeing them in action is another. This guide walks through practical implementations you can build today.

Example 1: Semantic search for documentation

Problem: Users can't find answers in your docs because they search with different words than your documentation uses.

Solution: Convert all documentation to vectors, search by meaning instead of keywords.

How it works

  1. Index documents: Split docs into chunks, convert to embeddings, store in vector DB
  2. Search: Convert user query to embedding, find most similar chunks
  3. Display: Show relevant documentation sections

Code example (Python + Pinecone)

from pinecone import Pinecone
from openai import OpenAI

# Initialize (pinecone.init was removed in v3 of the SDK)
client = OpenAI()
pc = Pinecone(api_key="your-key")
index = pc.Index("docs")

# Index a document
def index_document(text, doc_id):
    # Split into chunks (simplified)
    chunks = [text[i:i+1000] for i in range(0, len(text), 1000)]

    for i, chunk in enumerate(chunks):
        # Create embedding
        response = client.embeddings.create(
            model="text-embedding-3-small",
            input=chunk
        )
        embedding = response.data[0].embedding

        # Store in Pinecone
        index.upsert([(
            f"{doc_id}-{i}",
            embedding,
            {"text": chunk, "doc_id": doc_id}
        )])

# Search documents
def search_docs(query, top_k=5):
    # Convert query to embedding
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    query_embedding = response.data[0].embedding

    # Search similar vectors
    results = index.query(
        vector=query_embedding,
        top_k=top_k,
        include_metadata=True
    )

    # Return matching text
    return [match['metadata']['text'] for match in results['matches']]

# Usage
search_docs("How do I reset my password?")
# Returns relevant docs even if they say "change credentials" instead
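
The chunking in index_document is deliberately naive: fixed 1,000-character slices can cut an answer in half at a boundary. A sketch of overlapping chunks, where chunk_size and overlap are illustrative defaults to tune for your content:

def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into overlapping chunks so content spanning a
    boundary still appears intact in at least one chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# chunk_text("x" * 2500) -> slices starting at 0, 800, 1600, 2400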

Real companies using this:

  • Notion AI searches your workspace semantically
  • GitHub Copilot searches code repositories by intent
  • Stripe docs use semantic search to find relevant API information

Example 2: Chatbot with conversation memory

Problem: Chatbots forget previous messages, making multi-turn conversations frustrating.

Solution: Store conversation history as vectors, retrieve relevant context for each message.

How it works

  1. Store messages: Each user message and AI response → embedding → vector DB
  2. Retrieve context: For new message, find similar past conversations
  3. Augment prompt: Feed relevant history to LLM with new message

Code example (Python + Chroma)

import chromadb
from openai import OpenAI

client = OpenAI()
chroma_client = chromadb.Client()
collection = chroma_client.create_collection("chat_history")

def save_message(user_id, message, role):
    """Store message in vector DB"""
    collection.add(
        documents=[message],
        metadatas=[{"user_id": user_id, "role": role}],
        ids=[f"{user_id}-{len(collection.get())}"]
    )

def get_relevant_history(user_id, current_message, n=5):
    """Retrieve relevant past messages"""
    count = collection.count()
    if count == 0:
        return []  # nothing stored yet
    results = collection.query(
        query_texts=[current_message],
        n_results=min(n, count),  # never request more results than exist
        where={"user_id": user_id}
    )
    return results['documents'][0]

def chat(user_id, message):
    """Chat with memory"""
    # Get relevant past context
    context = get_relevant_history(user_id, message)

    # Build prompt with context
    prompt = f"""Previous relevant conversation:
{chr(10).join(context)}

Current message: {message}

Respond naturally, referencing past conversation when relevant."""

    # Get AI response
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )

    ai_message = response.choices[0].message.content

    # Save both messages
    save_message(user_id, message, "user")
    save_message(user_id, ai_message, "assistant")

    return ai_message

# Usage
chat("user123", "My name is Alice")
chat("user123", "What's the weather like?")
chat("user123", "What did I tell you my name was?")
# The bot recalls "Alice" because the name message is retrieved as relevant context
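
Pure similarity search can miss what was said a moment ago if it isn't semantically close to the new message. In practice you'd usually combine vector recall with a plain recency window; a sketch building on the functions above (the in-memory recency tracking is illustrative):

from collections import defaultdict

recent_messages = defaultdict(list)  # last few turns per user, kept in memory

def save_message_with_recency(user_id, message, role):
    save_message(user_id, message, role)  # long-term vector memory
    recent_messages[user_id].append(message)
    recent_messages[user_id] = recent_messages[user_id][-6:]  # keep last 6 turns

def build_context(user_id, current_message):
    """Recent turns verbatim, plus semantically similar older messages"""
    recent = recent_messages[user_id]
    recalled = get_relevant_history(user_id, current_message)
    seen = set(recent)
    return recent + [m for m in recalled if m not in seen]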

Real examples:

  • ChatGPT uses vector search for long-term memory features
  • Customer service bots reference past support tickets
  • Personal AI assistants remember user preferences

Example 3: Product recommendation system

Problem: Show users products similar to what they're viewing.

Solution: Convert product descriptions to vectors, find nearest neighbors.

How it works

  1. Index products: Product title + description → embedding → vector DB
  2. Find similar: When user views product, search for similar vectors
  3. Display recommendations: Show top matches

Code example (Python + Qdrant)

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
from openai import OpenAI

client = OpenAI()
qdrant = QdrantClient(":memory:")  # Use actual server in production

# Create collection
qdrant.create_collection(
    collection_name="products",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

def index_product(product_id, name, description, price, category):
    """Add product to vector DB"""
    # Create embedding from name + description
    text = f"{name}. {description}"
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    embedding = response.data[0].embedding

    # Store in Qdrant
    qdrant.upsert(
        collection_name="products",
        points=[PointStruct(
            id=product_id,
            vector=embedding,
            payload={
                "name": name,
                "description": description,
                "price": price,
                "category": category
            }
        )]
    )

def recommend_similar(product_id, limit=5):
    """Find similar products"""
    # Get product vector
    product = qdrant.retrieve(
        collection_name="products",
        ids=[product_id],
        with_vectors=True  # vectors aren't returned unless requested
    )[0]

    # Search similar
    results = qdrant.search(
        collection_name="products",
        query_vector=product.vector,
        limit=limit + 1  # +1 because first result is the product itself
    )

    # Return recommendations (skip first = self)
    return [
        {
            "name": hit.payload["name"],
            "price": hit.payload["price"],
            "similarity": hit.score
        }
        for hit in results[1:]
    ]

# Usage
index_product(1, "Blue Running Shoes", "Lightweight athletic shoes for daily training", 79.99, "footwear")
index_product(2, "Trail Running Sneakers", "Durable shoes for off-road running", 99.99, "footwear")
index_product(3, "Yoga Mat", "Non-slip exercise mat", 29.99, "fitness")

recommend_similar(1)
# Trail running sneakers rank first; the yoga mat scores far lower
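
Qdrant can also constrain the search with payload filters, for example to recommend only within the product's category. A sketch using the payload fields defined above:

from qdrant_client.models import Filter, FieldCondition, MatchValue

def recommend_in_category(product_id, category, limit=5):
    """Find similar products restricted to one category"""
    product = qdrant.retrieve(
        collection_name="products",
        ids=[product_id],
        with_vectors=True
    )[0]
    results = qdrant.search(
        collection_name="products",
        query_vector=product.vector,
        query_filter=Filter(
            must=[FieldCondition(key="category", match=MatchValue(value=category))]
        ),
        limit=limit + 1
    )
    return results[1:]  # skip the product itself (it matches its own category)

# recommend_in_category(1, "footwear") returns only footwear matches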

Real examples:

  • Amazon's "Customers who bought this also bought"
  • Spotify's song recommendations
  • Netflix's movie suggestions

Example 4: Duplicate content detection

Problem: Users submit duplicate support tickets or questions already answered.

Solution: Check if incoming content is similar to existing entries.

How it works

  1. Index existing content: All past tickets/questions → embeddings
  2. Check new content: Convert to embedding, search for high similarity matches
  3. Suggest duplicates: If similarity > threshold, show user existing content

Code example (Python + pgvector)

import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector
from openai import OpenAI

client = OpenAI()
conn = psycopg2.connect(database="support_db")

# Setup (run once)
def setup_database():
    cur = conn.cursor()
    cur.execute('CREATE EXTENSION IF NOT EXISTS vector')
    cur.execute('''
        CREATE TABLE IF NOT EXISTS tickets (
            id SERIAL PRIMARY KEY,
            content TEXT,
            embedding vector(1536),
            status TEXT
        )
    ''')
    conn.commit()

setup_database()
register_vector(conn)  # must run after the vector extension exists; enables numpy array parameters

def add_ticket(content):
    """Add new support ticket"""
    # Create embedding
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=content
    )
    embedding = np.array(response.data[0].embedding)  # pgvector adapts numpy arrays

    # Check for duplicates first
    duplicates = find_similar_tickets(embedding)
    if duplicates:
        return {
            "duplicate": True,
            "similar_tickets": duplicates
        }

    # Not a duplicate, add ticket
    cur = conn.cursor()
    cur.execute(
        'INSERT INTO tickets (content, embedding, status) VALUES (%s, %s, %s) RETURNING id',
        (content, embedding, 'open')
    )
    ticket_id = cur.fetchone()[0]  # psycopg2 has no reliable lastrowid on PostgreSQL
    conn.commit()
    return {"duplicate": False, "ticket_id": ticket_id}

def find_similar_tickets(embedding, threshold=0.85):
    """Find tickets with similarity above threshold"""
    cur = conn.cursor()
    cur.execute('''
        SELECT id, content, 1 - (embedding <=> %s) AS similarity
        FROM tickets
        WHERE 1 - (embedding <=> %s) > %s
        ORDER BY similarity DESC
        LIMIT 5
    ''', (embedding, embedding, threshold))

    return [
        {"id": row[0], "content": row[1], "similarity": row[2]}
        for row in cur.fetchall()
    ]

# Usage
add_ticket("I can't log in to my account")
add_ticket("Unable to access my account login")  # Detected as duplicate!

Real examples:

  • Stack Overflow suggests similar questions while you type
  • GitHub detects duplicate issues
  • Zendesk flags similar support tickets

Example 5: Multi-modal search (text + images)

Problem: Search images using text descriptions.

Solution: Use multi-modal embeddings (like CLIP) to search images by semantic meaning.

Simplified example

from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
pc = Pinecone(api_key="your-key")
index = pc.Index("images")

def index_image(image_url, caption, image_id):
    """Index image with its caption"""
    # Use caption as embedding (simplified)
    # In production, use CLIP or similar multi-modal model
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=caption
    )
    embedding = response.data[0].embedding

    index.upsert([(
        image_id,
        embedding,
        {"image_url": image_url, "caption": caption}
    )])

def search_images(text_query, top_k=10):
    """Search images using text"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text_query
    )
    query_embedding = response.data[0].embedding

    results = index.query(
        vector=query_embedding,
        top_k=top_k,
        include_metadata=True
    )

    return [match['metadata'] for match in results['matches']]

# Usage
search_images("sunset over mountains")
# Returns images matching that description, even if captions use different words
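
To search actual image content rather than captions, a multi-modal model embeds images and text into the same vector space. A sketch using the open-source CLIP checkpoint on Hugging Face (one option among several; this checkpoint produces 512-dimensional vectors, so the index dimension would be 512 rather than 1536):

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_image(path):
    """Image file -> vector in CLIP's shared text/image space"""
    inputs = processor(images=Image.open(path), return_tensors="pt")
    with torch.no_grad():
        features = model.get_image_features(**inputs)
    return features[0].tolist()

def embed_text(text):
    """Text -> vector in the same space, so text queries match images"""
    inputs = processor(text=[text], return_tensors="pt", padding=True)
    with torch.no_grad():
        features = model.get_text_features(**inputs)
    return features[0].tolist()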

Real examples:

  • Google Photos search ("show me pictures of dogs")
  • Pinterest visual search
  • Shopify product image search

Choosing the right vector database

Based on these examples, here's what to use:

Pinecone: Easiest for prototypes and MVPs, fully managed
Chroma: Best for local development and small-scale apps
Qdrant: Great for production, excellent filtering and performance
pgvector: Use if you already have PostgreSQL
Weaviate: Best for hybrid search (keywords + vectors)

Performance tips

From real implementations:

Chunking matters: Test different chunk sizes (256, 512, 1024 tokens) for your use case
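
A sketch of token-based chunking with tiktoken, so the sizes you test are real token counts (cl100k_base is the encoding used by OpenAI's current embedding models):

import tiktoken

def chunk_by_tokens(text, chunk_size=512):
    """Split text into chunks of roughly chunk_size tokens"""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    return [
        enc.decode(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]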

Metadata filtering: Filter by category/date first, then vector search within results
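
Pinecone, for example, takes a metadata filter alongside the query vector. A sketch reusing the index and query_embedding from Example 1 (the category field is illustrative):

results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={"category": {"$eq": "footwear"}},  # only matching vectors are searched
    include_metadata=True
)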

Batch operations: Insert/update in batches of 100-1000, not one-by-one
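
A sketch of batched upserts against the Pinecone index from Example 1 (100 per batch is a reasonable starting point, not a hard rule):

def upsert_in_batches(index, vectors, batch_size=100):
    """vectors: list of (id, embedding, metadata) tuples"""
    for i in range(0, len(vectors), batch_size):
        index.upsert(vectors[i:i + batch_size])  # one network call per batch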

Cache embeddings: Don't re-embed the same content repeatedly
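
A minimal in-process cache keyed by a hash of the text; a real deployment would persist this in Redis or a database table:

import hashlib

_embedding_cache = {}

def get_embedding_cached(text):
    """Embed text, reusing the cached vector for identical content"""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _embedding_cache:
        response = client.embeddings.create(
            model="text-embedding-3-small",
            input=text
        )
        _embedding_cache[key] = response.data[0].embedding
    return _embedding_cache[key]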

Monitor recall: Measure whether you're actually finding the right results
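
A sketch of recall@k over a small hand-labeled set of (query, expected id) pairs; the labeled pairs and search function are whatever your own pipeline provides:

def recall_at_k(labeled_pairs, search_fn, k=5):
    """labeled_pairs: list of (query, expected_id); search_fn returns ranked ids"""
    hits = sum(
        1 for query, expected_id in labeled_pairs
        if expected_id in search_fn(query)[:k]
    )
    return hits / len(labeled_pairs)

# recall_at_k([("reset password", "docs-auth-3")], my_search) -> fraction found in top 5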

What to try next

Pick one example and implement it:

  1. Start with the semantic search example (simplest)
  2. Use a free vector DB tier (Pinecone, Chroma)
  3. Index 50-100 items to test
  4. Measure search quality
  5. Iterate on chunking and retrieval

Related reading: