Transformer
Also known as: Transformer Architecture, Transformer Model
In one sentence
A neural network architecture that uses attention mechanisms to model how the words (tokens) in a sequence relate to one another, and that underlies modern LLMs.
Explain like I'm 12
A type of AI brain design that can pay attention to all words in a sentence at once, figuring out how they relate to each other—like reading the whole page instead of one word at a time.
In context
The foundation of GPT, Claude, BERT, and most modern language models. Introduced in Google's 2017 'Attention Is All You Need' paper.
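To make "paying attention to all words at once" concrete, here is a minimal sketch of single-head scaled dot-product self-attention, the core operation described in the paper. It is an illustration under simplifying assumptions, not a full Transformer: there is no multi-head split, masking, positional encoding, or feed-forward layer. The function names (`softmax`, `matmul`, `self_attention`, `rand_matrix`) are hypothetical, and random matrices stand in for learned weights.

```python
import math
import random


def softmax(row):
    """Turn a list of scores into positive weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x has one row per token; w_q, w_k, w_v project tokens to
    queries, keys, and values (learned in a real model).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    # Every token's query is compared with every token's key at once.
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    # Each output row is a weighted mix of all value vectors.
    return matmul(weights, v), weights


def rand_matrix(rows, cols):
    """Random stand-in for learned parameters or token embeddings."""
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
seq_len, d_model = 4, 8  # 4 tokens, 8-dimensional embeddings (arbitrary sizes)
x = rand_matrix(seq_len, d_model)
w_q, w_k, w_v = (rand_matrix(d_model, d_model) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
```

Here `weights` is a 4×4 table: row i holds how much token i attends to each token in the sequence, and each row sums to 1. `out` has the same shape as `x`, with each token's vector replaced by a blend of the whole sequence.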
Related Guides
Learn more about the Transformer in these guides:
Natural Language Processing: How AI Understands Text
Intermediate, 8 min read
NLP is how AI reads, understands, and generates human language. Learn the techniques behind chatbots, translation, and text analysis.
AI Model Architectures: A High-Level Overview
Intermediate, 7 min read
From transformers to CNNs to diffusion models, understand the different AI architectures and what they're good at.
Multimodal Models: Text + Image + Audio
Intermediate, 10 min read
AI that understands text, images, and audio together. How multimodal models work and what they enable.