Semantic Search: Search by Meaning, Not Keywords
Semantic search finds results based on meaning, not exact keyword matches. Learn how it works and how to implement it.
TL;DR
Semantic search uses embeddings to find results by meaning, not just keywords. "How to fix a drip" matches "repairing leaky faucets" even without shared words.
How semantic search works
1. Convert each document to an embedding (a vector)
2. A user searches: "best running shoes"
3. Convert the query to an embedding the same way
4. Find the document vectors nearest to the query vector
5. Return the top matches
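A minimal sketch of this flow using the open-source sentence-transformers library; the documents, query, and model choice here are illustrative, not prescriptive:

```python
# Minimal semantic search sketch with sentence-transformers.
# Documents and query are made-up placeholders.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Guide to repairing leaky faucets",
    "Best trail running shoes of the year",
    "Python tutorial for beginners",
]

# 1. Convert documents to embeddings (vectors)
doc_embeddings = model.encode(documents, convert_to_tensor=True)

# 2-3. Embed the user query the same way
query_embedding = model.encode("how to fix a drip", convert_to_tensor=True)

# 4. Find the nearest document vectors by cosine similarity
hits = util.semantic_search(query_embedding, doc_embeddings, top_k=2)[0]

# 5. Return the top matches
for hit in hits:
    print(documents[hit["corpus_id"]], round(hit["score"], 3))
```

Note that the query shares no words with the top result; the match comes entirely from embedding similarity.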
Semantic vs keyword search
Keyword (BM25):
- Matches exact words
- "Python tutorial" finds "Python" + "tutorial"
- Misses synonyms and related concepts
Semantic:
- Matches meaning
- "Python tutorial" finds "Learn to code in Python"
- Understands intent
Hybrid (often the best of both):
- Combines both approaches
- Keyword for precision, semantic for recall
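One simple way to hybridize is a weighted sum of normalized keyword and semantic scores. The sketch below uses the rank_bm25 library for the keyword side; the 50/50 weighting and min-max normalization are illustrative choices (production systems often use reciprocal rank fusion instead):

```python
# Hybrid scoring sketch: weighted sum of BM25 and cosine scores.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

docs = ["Learn to code in Python", "Java tutorial", "Python snake care"]
model = SentenceTransformer("all-MiniLM-L6-v2")

bm25 = BM25Okapi([d.lower().split() for d in docs])
doc_emb = model.encode(docs, convert_to_tensor=True)

def hybrid_scores(query: str, alpha: float = 0.5) -> np.ndarray:
    # Keyword side: BM25 scores, min-max normalized to [0, 1]
    kw = np.array(bm25.get_scores(query.lower().split()))
    kw = (kw - kw.min()) / (kw.max() - kw.min() + 1e-9)
    # Semantic side: cosine similarities against every document
    sem = util.cos_sim(model.encode(query, convert_to_tensor=True), doc_emb)
    sem = sem.cpu().numpy().ravel()
    return alpha * kw + (1 - alpha) * sem

print(hybrid_scores("Python tutorial"))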
Implementation steps
1. Index documents:
- Chunk documents
- Generate embeddings
- Store in vector database
2. Search:
- Embed user query
- Find k-nearest neighbors
- Return results
3. Rank and display:
- Optional reranking
- Present to user
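Putting the three steps together, here is an end-to-end sketch using Chroma as the vector store (it embeds documents with a built-in default model on insert). The fixed-size word chunker is a deliberately naive stand-in for a real chunking strategy, and reranking is omitted:

```python
# End-to-end sketch: chunk, embed, store, search.
import chromadb

def chunk(text: str, size: int = 200) -> list[str]:
    # Naive fixed-size chunking by word count (placeholder strategy)
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

client = chromadb.Client()
collection = client.create_collection("docs")

# 1. Index documents: chunk, embed (handled by Chroma), store
corpus = ["long document text about plumbing repairs...",
          "another document about running gear..."]
for doc_id, text in enumerate(corpus):
    for j, piece in enumerate(chunk(text)):
        collection.add(documents=[piece], ids=[f"doc{doc_id}-chunk{j}"])

# 2. Search: embed the query and find the k nearest chunks
results = collection.query(query_texts=["how to fix a drip"], n_results=3)

# 3. Rank and display (optional reranking would go here)
for doc in results["documents"][0]:
    print(doc)
```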
Choosing an embedding model
General-purpose:
- OpenAI text-embedding-3-small
- Sentence Transformers all-MiniLM-L6-v2
Specialized:
- Domain-specific models (legal, medical)
- Fine-tuned on your data
Considerations:
- Speed vs accuracy
- Cost
- Language support
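With sentence-transformers, swapping models is a one-line change, which makes it cheap to benchmark a few candidates on your own queries before committing:

```python
# Model choice is one string; dimension, speed, and quality vary by model.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, fast, general-purpose
print(model.get_sentence_embedding_dimension())  # 384 dimensions
```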
Measuring similarity
Cosine similarity:
- Most common
- -1 to 1 (1 = identical)
Dot product:
- Faster to compute
- Equivalent to cosine when vectors are normalized to unit length
Euclidean distance:
- Less common for text
- A distance rather than a similarity: lower means closer
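Each measure is a few lines of NumPy; note that after normalizing to unit length, dot product and cosine give identical results:

```python
# Computing the three similarity measures directly (toy 3-dim vectors).
import numpy as np

a = np.array([0.2, 0.7, 0.1])
b = np.array([0.3, 0.6, 0.2])

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))  # -1 to 1; 1 = same direction
dot = a @ b                                               # unbounded unless normalized
euclidean = np.linalg.norm(a - b)                         # distance: lower = more similar

# After unit-normalization, dot product equals cosine similarity
a_u, b_u = a / np.linalg.norm(a), b / np.linalg.norm(b)
assert np.isclose(a_u @ b_u, cosine)
```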
Optimizing performance
Indexing:
- Use approximate nearest neighbor (ANN) algorithms
- HNSW, IVF, Product Quantization
- Trade a little accuracy for a large gain in speed
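As a concrete example, FAISS's HNSW index trades a little recall for much faster search. The parameters below (M = 32, efSearch = 64) are reasonable starting points rather than recommendations, and the random vectors are placeholders:

```python
# ANN search sketch with FAISS's HNSW index.
import faiss
import numpy as np

dim, n = 384, 10_000
vectors = np.random.rand(n, dim).astype("float32")

index = faiss.IndexHNSWFlat(dim, 32)  # 32 = M, the graph connectivity
index.hnsw.efSearch = 64              # higher = more accurate but slower queries
index.add(vectors)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)  # approximate 5 nearest neighbors
print(ids[0])
```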
Caching:
- Cache common queries
- Precompute popular results
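One cheap form of caching is memoizing query embeddings, since popular queries repeat; a sketch with functools.lru_cache:

```python
# Memoize query embeddings so repeated queries skip the model call.
from functools import lru_cache
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

@lru_cache(maxsize=10_000)
def embed_query(query: str) -> tuple:
    # tuple() returns an immutable copy so callers can't corrupt the cache
    return tuple(model.encode(query))

embed_query("best running shoes")  # computed once
embed_query("best running shoes")  # served from the cache
```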
Filtering:
- Metadata filters before semantic search
- Reduces search space
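In Chroma, for example, a `where` clause restricts candidates by metadata before the nearest-neighbor search runs; the field names and values here are illustrative:

```python
# Metadata filtering narrows the candidate set before vector search.
import chromadb

client = chromadb.Client()
coll = client.create_collection("support_tickets")
coll.add(
    documents=["Printer jams on startup", "Faucet drips overnight"],
    metadatas=[{"category": "hardware"}, {"category": "plumbing"}],
    ids=["t1", "t2"],
)

results = coll.query(
    query_texts=["how to fix a drip"],
    n_results=1,
    where={"category": "plumbing"},  # applied before the similarity search
)
print(results["documents"][0])
```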
Use cases
- Document search
- Customer support (find similar tickets)
- E-commerce (visual + text search)
- Code search (find similar functions)
- Legal research (case similarity)
Challenges
- Embeddings don't capture all nuances
- May miss exact-match requirements
- Requires good embedding model
- Computationally more expensive than keyword search
What's next
- Embeddings Explained
- Vector Databases
- RAG Retrieval Strategies
Key Terms Used in This Guide
Embedding
A list of numbers that represents the meaning of text. Similar meanings have similar numbers, so computers can compare by 'closeness'.
RAG (Retrieval-Augmented Generation)
A technique where AI searches your documents for relevant info, then uses it to generate accurate, grounded answers.
Related Guides
Retrieval Strategies for RAG Systems
Intermediate · RAG systems retrieve relevant context before generating responses. Learn retrieval strategies, ranking, and optimization techniques.
Vector Database Fundamentals
Intermediate · Vector databases store and search embeddings efficiently. Learn how they work, when to use them, and popular options.
Training Custom Embedding Models
Advanced · Fine-tune or train embedding models for your domain. Improve retrieval quality with domain-specific embeddings.