Tokenizer

Also known as: Tokenization, Token Encoding

In one sentence

A tool that breaks text into smaller pieces (tokens) that an AI model can process. Different models use different tokenizers, so the same text can be split and counted differently.

Explain like I'm 12

It's like cutting a sandwich into bite-sized pieces so you can eat it. The tokenizer cuts your text into little chunks (tokens) so the AI can 'digest' and understand it.

In context

Example: 'Hello world' might become 2 tokens ('Hello' and ' world'), 1 token ('Hello world'), or even 3 tokens ('Hel', 'lo', ' world'), depending on the tokenizer. This affects costs, since APIs charge per token.
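
As a concrete illustration, here is a minimal sketch using OpenAI's open-source tiktoken library. The library and the cl100k_base encoding are illustrative assumptions, not something this entry prescribes; other model families ship their own tokenizers.

    # Minimal sketch using tiktoken (pip install tiktoken).
    # cl100k_base is one encoding among many; other models use
    # different vocabularies and will split text differently.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    text = "Hello world"
    token_ids = enc.encode(text)
    print(len(token_ids), token_ids)   # e.g. 2 [9906, 1917] under cl100k_base

    # Decode each id on its own to see where the tokenizer made its cuts.
    pieces = [enc.decode([tid]) for tid in token_ids]
    print(pieces)                      # e.g. ['Hello', ' world']

Running the same text through a different encoding (for example, tiktoken's gpt2 encoding) can yield a different token count, which is why per-token pricing and context limits depend on the model.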
