TL;DR

Embeddings are numerical representations of text in which similar meanings map to nearby vectors. They power semantic search, recommendations, and retrieval-augmented generation (RAG) by making meaning measurable with math.

What are embeddings?

Simple explanation:
Embeddings convert words, sentences, or documents into arrays of numbers (vectors) that represent meaning.

Example:

  • "king" → [0.2, 0.8, -0.3, ...]
  • "queen" → [0.19, 0.79, -0.25, ...] (similar!)
  • "banana" → [-0.5, 0.1, 0.9, ...] (different)

Similar concepts cluster together in vector space.
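
A minimal sketch of this idea, assuming the sentence-transformers package is installed (all-MiniLM-L6-v2 is one of the models listed further down; any embedding model works the same way):

  # Embed a few texts and compare them: similar meanings -> similar vectors.
  # Assumes: pip install sentence-transformers
  from sentence_transformers import SentenceTransformer
  import numpy as np

  model = SentenceTransformer("all-MiniLM-L6-v2")  # small 384-dimensional model

  vectors = model.encode(["king", "queen", "banana"])  # shape (3, 384)

  def cosine(a, b):
      # 1.0 = same direction (same meaning), near 0 or negative = unrelated
      return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

  print(cosine(vectors[0], vectors[1]))  # "king" vs "queen"  -> higher
  print(cosine(vectors[0], vectors[2]))  # "king" vs "banana" -> lower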

Why embeddings matter

Semantic search:

  • Find documents by meaning, not just keywords
  • "How to fix a leaky faucet" matches "plumbing repairs"

Recommendations:

  • "Similar items" based on meaning
  • Works across languages with multilingual embedding models

RAG systems:

  • Retrieve the most relevant context to include in LLM prompts
  • The core retrieval step when building AI apps over your own data

How embeddings work

  1. Train a model on billions of words
  2. Learn relationships (king - man + woman ≈ queen)
  3. Encode text into fixed-size vectors
  4. Measure similarity between vectors (typically cosine similarity)
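
Steps 2 and 4 are plain vector arithmetic; a toy illustration with numpy (the 3-dimensional vectors below are made up, real embeddings have hundreds or thousands of dimensions):

  import numpy as np

  # Toy 3-dimensional "embeddings" (real models use far more dimensions).
  king  = np.array([0.8, 0.7, 0.1])
  man   = np.array([0.6, 0.2, 0.1])
  woman = np.array([0.6, 0.2, 0.8])
  queen = np.array([0.8, 0.7, 0.8])

  def cosine(a, b):
      return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

  # Step 2: relationships become vector offsets: king - man + woman lands near queen.
  print(cosine(king - man + woman, queen))  # ~1.0 for these toy values

  # Step 4: similarity is the angle between vectors (cosine similarity).
  print(cosine(king, queen))  # high: related concepts point in similar directions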

Popular embedding models

  • OpenAI: text-embedding-3-small, text-embedding-3-large
  • Sentence Transformers: all-MiniLM-L6-v2, all-mpnet-base-v2
  • Google: Universal Sentence Encoder
  • Cohere: embed-english-v3.0
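
For hosted models this is usually a single API call; a sketch with the OpenAI Python client, assuming the openai package is installed and OPENAI_API_KEY is set in the environment:

  # Fetching an embedding from a hosted provider (OpenAI shown here).
  from openai import OpenAI

  client = OpenAI()
  resp = client.embeddings.create(
      model="text-embedding-3-small",
      input="How to fix a leaky faucet",
  )
  vector = resp.data[0].embedding  # list of floats, 1536 dimensions for this model
  print(len(vector))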

Embedding dimensions

  • Small models: 384-768 dimensions (faster and cheaper, slightly less accurate)
  • Large models: 1024-3072 dimensions (slower, generally more accurate)
  • The trade-off is speed and storage versus retrieval quality
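
A quick way to check these sizes locally with sentence-transformers; the note on OpenAI's dimensions parameter applies to the text-embedding-3 models:

  # Checking embedding dimensionality for local models.
  from sentence_transformers import SentenceTransformer

  small = SentenceTransformer("all-MiniLM-L6-v2")   # 384 dimensions
  large = SentenceTransformer("all-mpnet-base-v2")  # 768 dimensions
  print(small.get_sentence_embedding_dimension())
  print(large.get_sentence_embedding_dimension())

  # Hosted text-embedding-3 models also accept a `dimensions` argument to
  # request shorter vectors, trading some accuracy for speed and storage:
  # client.embeddings.create(model="text-embedding-3-large", input="...", dimensions=256)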

Use cases

  • Semantic search engines
  • Document clustering
  • Recommendation systems
  • Duplicate detection (see the sketch after this list)
  • Anomaly detection
  • RAG pipelines
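
As one concrete case, duplicate detection is just a similarity threshold over pairs of embeddings; a sketch assuming sentence-transformers, with an illustrative (not recommended) 0.9 cutoff:

  # Duplicate detection sketch: flag pairs whose embeddings are nearly identical.
  from itertools import combinations
  from sentence_transformers import SentenceTransformer, util

  model = SentenceTransformer("all-MiniLM-L6-v2")

  texts = [
      "How do I reset my password?",
      "I forgot my password, how can I reset it?",
      "What are your opening hours?",
  ]
  vecs = model.encode(texts, convert_to_tensor=True)

  THRESHOLD = 0.9  # illustrative cutoff; tune per model and dataset
  for i, j in combinations(range(len(texts)), 2):
      score = float(util.cos_sim(vecs[i], vecs[j]))
      if score >= THRESHOLD:
          print(f"Likely duplicates ({score:.2f}): {texts[i]!r} / {texts[j]!r}")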

What's next

  • Vector Databases
  • RAG Systems
  • Semantic Search