Vector Database Fundamentals
Vector databases store and search embeddings efficiently. Learn how they work, when to use them, and popular options.
TL;DR
Vector databases store embeddings and enable fast similarity search. Essential for RAG, recommendations, and semantic search at scale.
What is a vector database?
Definition:
A database optimized for storing and searching high-dimensional vectors (embeddings).
Why not regular databases?
- Regular DBs: Exact match, keyword search
- Vector DBs: Similarity search, semantic matching
- Vector DBs: Optimized for high-dimensional data
How they work
Index creation:
- Generate embeddings for documents
- Store vectors with metadata
- Build efficient index (HNSW, IVF, etc.)
Search:
- Convert query to embedding
- Find k nearest neighbors
- Return similar items
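The two phases above can be sketched with a toy in-memory store (a hypothetical `TinyVectorStore`, not a real library) that skips index-building and brute-forces cosine similarity against every stored vector:

```python
import numpy as np

# Toy in-memory vector store (hypothetical, for illustration only):
# brute-force cosine similarity instead of a real HNSW/IVF index.
class TinyVectorStore:
    def __init__(self):
        self.ids, self.vectors, self.metadata = [], [], []

    def upsert(self, doc_id, vector, meta=None):
        # Index creation: store the vector together with its id and metadata.
        self.ids.append(doc_id)
        self.vectors.append(np.asarray(vector, dtype=float))
        self.metadata.append(meta or {})

    def query(self, vector, top_k=3):
        # Search: compare the query embedding against every stored vector.
        q = np.asarray(vector, dtype=float)
        mat = np.stack(self.vectors)
        sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q))
        top = np.argsort(-sims)[:top_k]
        return [(self.ids[i], float(sims[i]), self.metadata[i]) for i in top]

store = TinyVectorStore()
store.upsert("doc1", [1.0, 0.0, 0.0], {"text": "plumbing repairs"})
store.upsert("doc2", [0.0, 1.0, 0.0], {"text": "pasta recipes"})
hits = store.query([0.9, 0.1, 0.0], top_k=1)  # nearest: "doc1"
```

Real vector databases replace the brute-force loop with an approximate index, which is what makes them fast at scale.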
Vector similarity search
Nearest neighbor search:
- Find items closest to query vector
- Measure: cosine similarity, dot product, L2 distance
Example:
- Query: "How to fix a leak?"
- Returns: Documents about plumbing repairs
- Even if exact words don't match
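The three similarity measures above, computed with NumPy on a pair of toy vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction as a, twice the length

dot = float(a @ b)                                       # dot product
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))   # direction only
l2 = float(np.linalg.norm(a - b))                        # Euclidean distance

# cosine is 1.0 (identical direction) even though the L2 distance is nonzero,
# which is why cosine similarity is the usual choice for text embeddings.
```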
Popular vector databases
Pinecone:
- Managed service
- Easy to use
- Auto-scaling
- Paid
Weaviate:
- Open source or managed
- Multi-modal support
- Hybrid search
- Self-host or cloud
Qdrant:
- Open source
- Rust-based (fast)
- Good filtering
- Self-host or cloud
Chroma:
- Lightweight
- Great for prototyping
- Open source
- Embedded mode
Milvus:
- Open source
- Highly scalable
- Enterprise features
pgvector (Postgres extension):
- Add vectors to existing Postgres
- Familiar SQL interface
- Good for small-medium scale
Key features
Metadata filtering:
- Search within filtered subset
- "Find similar docs from 2024"
Hybrid search:
- Combine vector + keyword search
- Best of both worlds
Multi-tenancy:
- Isolate data by user/org
- Important for SaaS
Scalability:
- Millions to billions of vectors
- Distributed architecture
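As one illustration of hybrid search, a common way to combine a keyword ranking with a vector ranking is Reciprocal Rank Fusion (RRF). The document lists here are hypothetical stand-ins for real BM25 and ANN results:

```python
# Hybrid search sketch: fuse a keyword ranking and a vector ranking
# with Reciprocal Rank Fusion. score(doc) = sum of 1 / (k + rank)
# over each ranking the doc appears in.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]   # e.g. BM25 order (hypothetical)
vector_hits  = ["doc1", "doc5", "doc3"]   # e.g. ANN order (hypothetical)
fused = rrf([keyword_hits, vector_hits])
# doc1 and doc3 appear in both lists, so they rise to the top
```

RRF needs no score normalization, which is why several vector databases use it as their default fusion method.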
When to use vector databases
Good for:
- RAG systems (document retrieval)
- Semantic search
- Recommendation engines
- Duplicate detection
- Anomaly detection
Overkill for:
- Small datasets (< 10K items)
- Exact match search
- Simple keyword search
Implementation example
from openai import OpenAI
from pinecone import Pinecone
# Initialize clients (API keys elided)
pc = Pinecone(api_key="...")
index = pc.Index("my-index")
openai_client = OpenAI()
# Index a document: embed the text, then upsert the vector with metadata
text = "How to reset password..."
embedding = openai_client.embeddings.create(
    input=text,
    model="text-embedding-3-small",
).data[0].embedding
index.upsert(vectors=[{"id": "doc1", "values": embedding, "metadata": {"text": text}}])
# Search: embed the query, then fetch the 3 nearest neighbors
query = "forgot my password"
query_emb = openai_client.embeddings.create(
    input=query,
    model="text-embedding-3-small",
).data[0].embedding
results = index.query(vector=query_emb, top_k=3, include_metadata=True)
Performance optimization
Indexing algorithms:
- HNSW (Hierarchical Navigable Small World): fast and accurate (most popular)
- IVF (Inverted File index): partitions vectors into clusters; good for large datasets
- Product Quantization (PQ): compresses vectors to reduce memory
Trade-offs:
- Accuracy vs speed
- Memory vs disk
- Indexing time vs query time
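The accuracy-vs-speed trade-off shows up even in a toy IVF-style search: partition vectors into cells around a few centroids (real IVF trains them with k-means; here they are simply sampled from the data), then probe only the cell nearest the query:

```python
import numpy as np

# IVF-style sketch: search only the cluster nearest the query.
# Fewer comparisons, but neighbors near a cell boundary can be missed,
# which is exactly the accuracy-vs-speed trade-off.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 8))

# "Train": use the first few vectors as centroids (real IVF uses k-means).
centroids = data[:4]
assignments = np.argmin(
    np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2), axis=1
)

def ivf_search(query, top_k=3):
    # Probe only the single nearest cell (nprobe=1).
    cell = int(np.argmin(np.linalg.norm(centroids - query, axis=1)))
    members = np.where(assignments == cell)[0]
    dists = np.linalg.norm(data[members] - query, axis=1)
    return members[np.argsort(dists)[:top_k]]

query = data[10]  # query with a vector we know is stored
hits = ivf_search(query)
# data[10] falls in the probed cell, so index 10 comes back first
```

Probing more cells (a larger `nprobe`) recovers accuracy at the cost of more comparisons per query.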
Cost considerations
Managed services:
- Pay per vector stored
- Pay per query
- $50-500+/month typical
Self-hosted:
- Server costs
- Maintenance effort
- More control
Best practices
- Choose embedding model carefully
- Experiment with index parameters
- Use metadata filtering
- Monitor query performance
- Implement caching for common queries
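A minimal sketch of the caching point: memoize query embeddings with `functools.lru_cache` so repeated queries skip the embedding call. `fake_embed` is a hypothetical stand-in for a real (paid, slow) embedding API:

```python
from functools import lru_cache

calls = {"count": 0}  # track how often the "API" is actually hit

def fake_embed(text):
    # Deterministic stand-in for a real embedding API call (hypothetical).
    calls["count"] += 1
    return tuple(ord(c) % 7 for c in text[:4])

@lru_cache(maxsize=1024)
def embed_query(text):
    # Repeated identical queries are served from the cache.
    return fake_embed(text)

embed_query("forgot my password")
embed_query("forgot my password")  # cache hit: no second embedding call
```

The same idea applies one level up: caching final search results for common queries avoids both the embedding call and the vector lookup.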
What's next
- Embeddings Explained
- RAG Systems
- Semantic Search