TL;DR

Semantic search finds results based on meaning rather than exact keyword matches. It uses embeddings (mathematical representations of text) to understand that "how to fix a drip" and "repairing leaky faucets" are about the same thing, even though they share no words. This makes search dramatically more useful, especially when users do not know the exact terminology.

Why it matters

Traditional keyword search has a fundamental flaw: it only works when the user types the exact words that appear in the document. If your help article says "reset your credentials" but the user searches for "change my password," keyword search misses the match entirely. Semantic search solves this by understanding that both phrases mean the same thing.

This matters for any application where people search for information. E-commerce sites lose sales when customers cannot find products because they use different words than the product descriptions. Customer support systems waste human agent time when the knowledge base search fails to surface relevant articles. Internal company wikis become useless when employees cannot find documents written by other teams who use different terminology.

Semantic search is also the foundation for retrieval-augmented generation (RAG), the technique that lets AI chatbots answer questions using your specific documents. If you are building any AI application that needs to find relevant information, semantic search is a core building block.

How semantic search works

The process has two main phases: indexing (preparing your documents) and searching (finding matches for a query).

During indexing, you take each document and break it into manageable chunks, perhaps paragraphs or sections. Each chunk is fed through an embedding model that converts the text into a vector, a list of numbers that captures the meaning of that text. These vectors are stored in a specialized vector database.

When a user searches, their query goes through the same embedding model to produce a query vector. The system then finds the document vectors that are closest to the query vector in the mathematical space. Closeness in this space means similarity in meaning. The nearest documents are returned as search results.
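The two phases above can be sketched in a few lines of Python. Note that the embed function here is a toy stand-in that counts a few hand-picked concept words; a real system would call an embedding model (such as text-embedding-3-small) and store the vectors in a vector database rather than a Python list.

```python
import math

# Toy stand-in for an embedding model: counts a few hand-picked "concept"
# words so related texts map to nearby vectors. A real system would call
# an actual embedding model here. CONCEPTS is purely illustrative.
CONCEPTS = ["leak", "drip", "faucet", "repair", "fix", "pasta", "cook"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    vec = [float(sum(1 for w in words if c in w)) for c in CONCEPTS]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit length: dot product = cosine

# Indexing phase: embed each chunk and store (vector, text) pairs.
chunks = ["repairing leaky faucets", "making fresh pasta at home"]
index = [(embed(c), c) for c in chunks]

# Search phase: embed the query with the SAME model, rank by similarity.
def search(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, v)), text) for v, text in index]
    return [text for _, text in sorted(scored, reverse=True)[:k]]

print(search("how to fix a drip"))  # matches despite sharing no words
```

Even with this crude stand-in, "how to fix a drip" lands nearest the faucet-repair chunk because both map onto the same concepts, which is exactly the behavior a real embedding model provides for arbitrary text.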

Think of it like plotting books on a map. Instead of organizing by title alphabetically, you organize by topic. Books about cooking end up near each other. Books about space exploration cluster together. When someone asks about "making pasta," the system finds items in the cooking neighborhood, regardless of their exact titles.

Semantic search versus keyword search

Keyword search, typically using an algorithm called BM25, matches documents that contain the exact words in your query. It is fast, well understood, and good at precision. If you search for "Python 3.12 release notes," keyword search will reliably find documents containing those specific terms.

Semantic search matches meaning instead of words. It excels at understanding intent. A search for "beginner programming language" might return a document titled "Getting Started with Python" even though the word "beginner" never appears in it. The embedding model understands that these concepts are related.

Each approach has strengths the other lacks. Keyword search is better when the user knows exactly what they want and uses precise terminology. Semantic search is better when the user is exploring, using different vocabulary, or does not know the technical term for what they need.

The best real-world systems use hybrid search, combining both approaches. The keyword component ensures exact-match precision, while the semantic component adds recall for conceptually related results. Many vector databases now support hybrid search natively, letting you blend both signals with configurable weights.
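Score blending can be sketched as follows, assuming you already have per-document scores from a keyword engine (e.g. BM25) and from a semantic search. The min-max normalization and the alpha weight are illustrative choices, not a standard; many production systems use reciprocal rank fusion instead.

```python
def hybrid_scores(keyword_scores: dict, semantic_scores: dict,
                  alpha: float = 0.5) -> dict:
    """Blend two per-document score dicts; alpha weights the semantic side.

    Both inputs are min-max normalized first so the scales are comparable
    (BM25 scores and cosine similarities live on very different ranges).
    """
    def normalize(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    kw, sem = normalize(keyword_scores), normalize(semantic_scores)
    docs = set(kw) | set(sem)
    return {d: (1 - alpha) * kw.get(d, 0.0) + alpha * sem.get(d, 0.0)
            for d in docs}

blended = hybrid_scores(
    {"doc1": 12.3, "doc2": 4.1},    # e.g. BM25 scores
    {"doc2": 0.91, "doc3": 0.78},   # e.g. cosine similarities
    alpha=0.6,
)
best = max(blended, key=blended.get)
```

Here doc2 wins because it appears in both result sets, which is the typical hybrid-search outcome: documents that match on both signals rise to the top.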

Implementation step by step

Building a semantic search system involves three phases. First, you prepare your data. Break your documents into chunks that are meaningful units of information. A chunk that is too small (a single sentence) lacks context. A chunk that is too large (an entire document) dilutes the specific information it contains. Paragraphs or sections of 100 to 500 words usually work well.
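A rough sketch of boundary-aware chunking under the word counts suggested above: split on blank lines (paragraph boundaries), then greedily merge short paragraphs until each chunk reaches a minimum size, flushing before a chunk would exceed the maximum. The min_words and max_words defaults are assumptions drawn from the guidance above, not fixed rules.

```python
def chunk_text(text: str, min_words: int = 100,
               max_words: int = 500) -> list[str]:
    """Split text into chunks along paragraph boundaries.

    Paragraphs are merged until a chunk has at least min_words; a chunk
    is flushed early if adding the next paragraph would exceed max_words.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, count = [], [], 0
    for para in paragraphs:
        words = len(para.split())
        # Flush before the chunk grows past the upper bound.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
        # Flush once the chunk is big enough to stand on its own.
        if count >= min_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
    if current:  # leftover paragraphs form a final (possibly short) chunk
        chunks.append("\n\n".join(current))
    return chunks
```

Because chunks never split mid-paragraph, each embedding covers a complete thought, which is the property that matters most for retrieval quality.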

Generate an embedding for each chunk using an embedding model. Store the embeddings along with the original text and any useful metadata (like the document title, date, author, or category) in a vector database such as Pinecone, Weaviate, Qdrant, or Chroma.

Second, when a search query comes in, convert it to an embedding using the same model. Then run a similarity search to find the k-nearest vectors in your database. The number k is how many results you want. Most applications return between 5 and 20 results.
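The k-nearest lookup can be sketched as a brute-force scan over stored vectors. Real vector databases avoid scanning everything by using approximate nearest-neighbor indexes such as HNSW, but the ranking logic is the same.

```python
import heapq

def top_k(query_vec: list[float], index: list[tuple[list[float], str]],
          k: int = 5) -> list[tuple[float, str]]:
    """Return the k nearest (score, chunk_id) pairs by dot product.

    Assumes all vectors are unit length, so dot product equals cosine
    similarity; index is a list of (vector, chunk_id) pairs.
    """
    scored = ((sum(q * v for q, v in zip(query_vec, vec)), chunk_id)
              for vec, chunk_id in index)
    return heapq.nlargest(k, scored)  # highest-scoring k, best first

# Tiny 2-dimensional example; real embeddings have hundreds of dimensions.
index = [([1.0, 0.0], "a"), ([0.0, 1.0], "b"), ([0.6, 0.8], "c")]
results = top_k([0.8, 0.6], index, k=2)
```

heapq.nlargest keeps memory bounded at k entries even over millions of vectors, though at large scale the approximate indexes mentioned above are what make queries fast.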

Third, optionally apply a reranking step. A reranker is a more sophisticated model that takes the initial results and re-scores them for relevance. This extra step is slower but often significantly improves result quality, especially for the top few results that users actually look at.
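The reranking step might look like the sketch below. The overlap scorer here is a toy stand-in; a real reranker would be a cross-encoder model that reads the query and each candidate document together, which is what makes it both slower and more accurate than embedding similarity alone.

```python
def rerank(query: str, candidates: list[str], score_fn,
           top_n: int = 3) -> list[str]:
    """Re-score the initial candidates with a stronger (slower) scorer
    and return the best top_n. score_fn(query, doc) -> float."""
    rescored = sorted(candidates, key=lambda doc: score_fn(query, doc),
                      reverse=True)
    return rescored[:top_n]

# Toy score_fn: fraction of query words appearing in the document.
# Stands in for a cross-encoder's relevance score.
def overlap(query: str, doc: str) -> float:
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q)

docs = ["reset your credentials", "change my password today", "billing faq"]
best = rerank("change password", docs, overlap, top_n=1)
```

Because users mostly look at the first few results, reranking only the top 20 to 100 candidates captures most of the quality gain at a fraction of the cost of reranking everything.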

Choosing an embedding model

Your embedding model choice directly affects search quality. General-purpose models like OpenAI's text-embedding-3-small or the open-source Sentence Transformers all-MiniLM-L6-v2 work well for most use cases. They have been trained on diverse text and produce good results across many domains.

For specialized domains like legal, medical, or scientific text, domain-specific models or models fine-tuned on your own data can dramatically improve results. Generic models might not understand that "myocardial infarction" and "heart attack" are the same thing, but a medical embedding model will.

Consider practical factors beyond pure accuracy. How fast does the model generate embeddings? What does it cost per embedding if you are using an API? Does it support the languages your users speak? How large are the resulting vectors (which affects storage costs)? For many applications, a slightly less accurate but much cheaper and faster model is the better choice.

Measuring and understanding similarity

Cosine similarity is the most common metric for comparing embeddings. It measures the angle between two vectors, producing a score from -1 (opposite directions) to 1 (same direction). A score of 0.85 or higher typically indicates strong relevance, but the exact threshold depends on your model and data.

Dot product is faster to compute and gives identical rankings when vectors are normalized to unit length. Many vector databases use dot product internally for performance reasons. Euclidean distance measures the straight-line distance between two points; it is less commonly used for text search but works well for some embedding types.
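All three metrics can be written out directly, and the relationship between them is easy to verify: for unit-length vectors, cosine and dot product agree exactly, and Euclidean distance produces the same ranking in reverse.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Angle-based similarity: dot product over the product of lengths."""
    dot_ab = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot_ab / (na * nb)

def dot(a: list[float], b: list[float]) -> float:
    """Raw dot product; equals cosine when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a: list[float], b: list[float]) -> float:
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two unit-length vectors at an angle of about 53 degrees.
u, v = [1.0, 0.0], [0.6, 0.8]
```

With u and v both unit length, cosine(u, v) and dot(u, v) are both 0.6, which is why databases can use the cheaper dot product on normalized embeddings without changing result order.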

The choice of metric matters less than the choice of embedding model. A good model with any reasonable metric will outperform a poor model with the "optimal" metric.

Common mistakes

The most common mistake is chunking documents poorly. If you split a document in the middle of a paragraph or separate a heading from its content, the embeddings will not capture the full meaning. Take the time to chunk intelligently along natural boundaries.

Another frequent error is using the wrong embedding model for your domain. A general-purpose model might perform poorly on highly specialized technical content. Test your search quality with real user queries before committing to a model.

People also forget about the cost of re-embedding. If you switch embedding models, you have to regenerate every single vector in your database. For millions of documents, this can be expensive and time-consuming. Choose your model carefully upfront.

Finally, many teams skip hybrid search and go pure semantic. This fails when users search for specific identifiers like product codes, error numbers, or exact phrases. Always consider whether adding a keyword search component would improve your results.

What's next?