Transformer
Also known as: Transformer Architecture, Transformer Model
In one sentence
A neural network architecture that revolutionised AI by using attention mechanisms to understand relationships between all words in a text simultaneously, enabling modern LLMs like GPT and Claude.
Explain like I'm 12
Older AI read sentences one word at a time, like reading with a magnifying glass. Transformers can see the whole page at once and understand how every word connects to every other word — that's why they're so much smarter.
In context
Introduced in Google's landmark 2017 paper 'Attention Is All You Need', the transformer architecture is the foundation of virtually every modern language model including GPT-4, Claude, Gemini, and Llama. The key innovation is the 'self-attention' mechanism, which lets the model weigh how important each word is relative to every other word in the input. This parallel processing also makes transformers much faster to train on GPUs compared to older sequential architectures like RNNs.
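The self-attention mechanism described above can be sketched in a few lines of NumPy. This is a minimal illustration, not production code: the weight matrices, toy dimensions, and variable names are all assumptions chosen for clarity, and real transformers add multiple heads, masking, and learned parameters.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token vectors."""
    q = x @ w_q  # queries: what each token is looking for
    k = x @ w_k  # keys: what each token offers to others
    v = x @ w_v  # values: the content each token carries
    d_k = q.shape[-1]
    # Score every token against every other token, all at once (the
    # "whole page" view, rather than one word at a time).
    scores = q @ k.T / np.sqrt(d_k)
    # Softmax each row so the scores become attention weights summing to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is an attention-weighted mix of all value vectors.
    return weights @ v

# Toy sizes: 4 tokens, 8-dimensional embeddings (hypothetical values).
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
w_q = rng.normal(size=(d_model, d_model))
w_k = rng.normal(size=(d_model, d_model))
w_v = rng.normal(size=(d_model, d_model))

out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # one updated vector per token: (4, 8)
```

Because the score matrix is computed for all token pairs in one matrix multiplication, nothing here is sequential, which is what makes the architecture so well suited to GPUs.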
Related Guides
Learn more about Transformer in these guides:
- AI Model Architectures: A High-Level Overview (Intermediate, 7 min read). From transformers to CNNs to diffusion models: understand the different AI architectures and what they're good at.
- Natural Language Processing: How AI Understands Text (Intermediate, 8 min read). NLP is how AI reads, understands, and generates human language. Learn the techniques behind chatbots, translation, and text analysis.
- Designing Custom AI Architectures (Advanced, 7 min read). Design specialized AI architectures for unique problems. When and how to go beyond pre-trained models and build custom solutions.