TL;DR

Semantic search uses embeddings to find results by meaning, not just keywords. "How to fix a drip" matches "repairing leaky faucets" even without shared words.

How semantic search works

  1. Convert documents to embeddings (vectors)
  2. User searches: "best running shoes"
  3. Convert query to embedding
  4. Find nearest document vectors
  5. Return top matches
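
The five steps above can be sketched in a few lines. This is a minimal illustration, not a production setup: embed() is a hypothetical stand-in that counts words over a fixed vocabulary, so it only captures word overlap. A real system would call an embedding model at that point. The sample documents are made up for the example.

```python
import math
import re
from collections import Counter

def embed(text, vocab):
    """Toy stand-in for an embedding model (assumption): word counts, L2-normalized."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def search(query, docs, doc_vecs, vocab, k=2):
    """Steps 3-5: embed the query, score every document, return the top k."""
    q = embed(query, vocab)
    # Vectors are unit length, so the dot product equals cosine similarity.
    scored = [(sum(a * b for a, b in zip(q, v)), d) for d, v in zip(docs, doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Step 1: convert documents to vectors once, ahead of time.
docs = [
    "Lightweight running shoes for marathon training",
    "Slow cooker recipes for busy weeknights",
    "Trail shoes with extra grip for running on mud",
]
vocab = sorted({w for d in docs for w in re.findall(r"[a-z]+", d.lower())})
doc_vecs = [embed(d, vocab) for d in docs]

# Step 2: the user searches.
results = search("best running shoes", docs, doc_vecs, vocab, k=2)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

Both shoe documents rank above the recipe document. With a real embedding model the same flow would also match paraphrases that share no words with the query.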

Keyword (BM25):

  • Matches exact words
  • "Python tutorial" finds documents containing "Python" and "tutorial"
  • Misses synonyms and related concepts

Semantic:

  • Matches meaning
  • "Python tutorial" finds "Learn to code in Python"
  • Understands intent

Hybrid (often the best default):

  • Combines both
  • Keyword matching for precision on exact terms, semantic matching for recall on paraphrases
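
The text above does not say how to combine the two result lists. One widely used option is Reciprocal Rank Fusion (RRF), sketched here as an assumption rather than the required method. It needs only the rank positions from each system, so BM25 scores and cosine scores never have to share a scale. The document ids and rankings are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids. k=60 is a commonly used damping constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

keyword_ranking = ["doc_a", "doc_b", "doc_c"]    # hypothetical BM25 order
semantic_ranking = ["doc_b", "doc_d", "doc_a"]   # hypothetical embedding order

fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
print(fused)
```

Documents that appear near the top of both lists (doc_b, doc_a) come out ahead of documents that only one system found.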

Implementation steps

1. Index documents:

  • Embed each document
  • Store the vectors in an index

2. Search:

  • Embed user query
  • Find k-nearest neighbors
  • Return results

3. Rank and display:

  • Optional reranking
  • Present to user
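
A rough sketch of how the three steps might fit together in one small class. The class name, the candidates parameter, and the rerank callback are all illustrative assumptions. The hand-written 2-D vectors stand in for a real embedding model, and the reranker is a placeholder for something like a more expensive scoring model.

```python
import math

def _normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

class SemanticIndex:
    def __init__(self, embed):
        self.embed = embed          # callable: text -> list of floats
        self.docs = []
        self.vectors = []

    def add(self, text):
        # Step 1: embed once at index time and store the unit-length vector.
        self.docs.append(text)
        self.vectors.append(_normalize(self.embed(text)))

    def search(self, query, k=3, rerank=None, candidates=10):
        # Step 2: embed the query and rank documents by cosine similarity.
        q = _normalize(self.embed(query))
        scored = sorted(
            ((sum(a * b for a, b in zip(q, v)), d) for v, d in zip(self.vectors, self.docs)),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Fetch extra candidates only when a reranker will reorder them.
        limit = max(k, candidates) if rerank else k
        hits = [doc for _, doc in scored[:limit]]
        # Step 3: optional reranking, then return the final top k.
        if rerank:
            hits = sorted(hits, key=lambda d: rerank(query, d), reverse=True)
        return hits[:k]

# Hypothetical vectors standing in for a real embedding model.
FAKE_VECTORS = {
    "reset my password": [0.9, 0.1],
    "forgot login credentials": [0.8, 0.3],
    "update billing address": [0.1, 0.9],
    "can't sign in": [0.9, 0.15],
}

index = SemanticIndex(embed=FAKE_VECTORS.__getitem__)
for text in ["reset my password", "forgot login credentials", "update billing address"]:
    index.add(text)

plain = index.search("can't sign in", k=2)
# Placeholder reranker: prefer documents that mention "login".
reranked = index.search("can't sign in", k=2, rerank=lambda q, d: "login" in d)
print(plain)
print(reranked)
```

Retrieving more candidates than you display gives the reranker room to promote a result the first pass ranked lower.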

Choosing an embedding model

General-purpose:

  • OpenAI text-embedding-3-small
  • Sentence Transformers all-MiniLM-L6-v2

Specialized:

  • Domain-specific models (legal, medical)
  • Fine-tuned on your data

Considerations:

  • Speed vs accuracy
  • Cost
  • Language support

Measuring similarity

Cosine similarity:

  • Most common
  • Ranges from -1 to 1 (1 = same direction, 0 = unrelated, -1 = opposite)

Dot product:

  • Cheaper to compute (no division by vector lengths)
  • Matches cosine similarity only when vectors are normalized to unit length

Euclidean distance:

  • Less common for text
  • On unit-length vectors, ranks results the same way as cosine similarity
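
To make the relationship between the three measures concrete, here is a small sketch in plain Python. The vectors are arbitrary example values. It shows that cosine ignores vector length, that dot product matches cosine once vectors are normalized, and that squared Euclidean distance between unit vectors equals 2 - 2 * cosine.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(a):
    n = norm(a)
    return [x / n for x in a]

a = [3.0, 4.0]
b = [6.0, 8.0]     # same direction as a, twice as long
c = [-3.0, -4.0]   # opposite direction
d = [4.0, 3.0]     # similar but not identical direction

print(cosine(a, b))   # length does not matter for cosine
print(cosine(a, c))   # opposite vectors sit at the bottom of the range
print(dot(a, b))      # raw dot product grows with vector length

# After normalization, dot product and cosine agree.
print(dot(normalize(a), normalize(d)), cosine(a, d))

# For unit vectors: squared Euclidean distance = 2 - 2 * cosine.
print(euclidean(normalize(a), normalize(d)) ** 2, 2 - 2 * cosine(a, d))
```

In practice this means that if your embeddings are already unit length, the choice among the three mostly affects speed, not ranking.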

Optimizing performance

Indexing:

  • Use approximate nearest neighbor (ANN) algorithms
  • HNSW, IVF, Product Quantization
  • Trade a small amount of recall for much faster queries
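
The trade-off is easiest to see in a stripped-down IVF-style index. This is a sketch of the idea only: the centroids are supplied by hand here (real libraries learn them, typically with k-means), and the function names and nprobe parameter are illustrative. Vectors are grouped by nearest centroid, and a query scans only the closest groups.

```python
import math

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    """Assign each vector id to the bucket of its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: _dist(vec, centroids[c]))
        buckets[nearest].append(vid)
    return buckets

def search_ivf(query, vectors, centroids, buckets, k=1, nprobe=1):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: _dist(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in buckets[c]]
    candidates.sort(key=lambda vid: _dist(query, vectors[vid]))
    return candidates[:k]

vectors = [[0.0, 0.0], [0.0, 1.0], [4.9, 5.0], [10.0, 10.0], [10.0, 9.0]]
centroids = [[0.0, 0.5], [10.0, 9.5]]   # hand-picked for the example (assumption)
buckets = build_ivf(vectors, centroids)

query = [6.0, 6.0]
fast = search_ivf(query, vectors, centroids, buckets, k=1, nprobe=1)
thorough = search_ivf(query, vectors, centroids, buckets, k=1, nprobe=2)
print(fast, thorough)
```

With nprobe=1 the search scans two vectors and misses the true nearest neighbor (vector 2), which sits in the other bucket near the boundary. With nprobe=2 it finds it, at the cost of scanning everything. Tuning that knob is the accuracy versus speed trade.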

Caching:

  • Cache common queries
  • Precompute popular results

Filtering:

  • Metadata filters before semantic search
  • Reduces search space
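
Pre-filtering can be sketched as a simple two-stage function. The item layout (id, vector, meta) and the filter keys are assumptions for illustration, and the vectors are toy values scored with a plain dot product.

```python
def filtered_search(query_vec, items, k=2, **filters):
    """Drop items that fail any metadata filter, then rank the rest by dot product."""
    candidates = [
        item for item in items
        if all(item["meta"].get(field) == value for field, value in filters.items())
    ]
    candidates.sort(
        key=lambda item: sum(a * b for a, b in zip(query_vec, item["vector"])),
        reverse=True,
    )
    return [item["id"] for item in candidates[:k]]

# Hypothetical indexed items with metadata.
items = [
    {"id": "faq-1", "vector": [1.0, 0.0], "meta": {"lang": "en", "type": "faq"}},
    {"id": "faq-2", "vector": [0.9, 0.1], "meta": {"lang": "de", "type": "faq"}},
    {"id": "blog-1", "vector": [0.8, 0.2], "meta": {"lang": "en", "type": "blog"}},
    {"id": "faq-3", "vector": [0.1, 0.9], "meta": {"lang": "en", "type": "faq"}},
]

english = filtered_search([1.0, 0.0], items, k=2, lang="en")
english_faqs = filtered_search([1.0, 0.0], items, k=2, lang="en", type="faq")
print(english, english_faqs)
```

The similarity scoring only runs over items that survive the filter, which is where the saving comes from.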

Use cases

  • Document search
  • Customer support (find similar tickets)
  • E-commerce (visual + text search)
  • Code search (find similar functions)
  • Legal research (case similarity)

Challenges

  • Embeddings don't capture all nuances
  • May miss exact-match needs such as product codes, names, or error strings
  • Result quality depends heavily on the embedding model
  • Computationally more expensive than keyword search

What's next

  • Embeddings Explained
  • Vector Databases
  • RAG Retrieval Strategies