Tokens & context windows

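What is a token, and why do models forget? A token is the unit of text a model reads, usually a word or word fragment a few characters long, and a model can only see what fits in its context window. Each model family uses its own tokenizer, so the same text produces different token counts across Claude, ChatGPT, and Gemini.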

How each model tokenizes and handles context

Claude

Full recall up to the context limit: the whole conversation is re-read on every message, and you are told when the limit is reached. Opus 4.7 uses a denser BPE tokenizer that produces ~35% more tokens for the same text.

Haiku 4.5: 200k · Sonnet 4.6: 1M · Opus 4.7: 1M

Custom BPE · ~3.5 chars/token (Haiku/Sonnet) · ~2.6 (Opus 4.7)
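
The ~35% figure follows from the two chars/token averages quoted above. A minimal arithmetic sketch, assuming those averages hold for the text in question (real ratios vary with language and content); the function name is illustrative:

```python
def extra_tokens_pct(baseline_chars_per_token: float, dense_chars_per_token: float) -> float:
    """Percent more tokens the denser tokenizer emits for the same text."""
    # tokens = chars / chars_per_token, so the character count cancels out.
    return (baseline_chars_per_token / dense_chars_per_token - 1.0) * 100.0


# Averages quoted above: ~3.5 (Haiku/Sonnet) vs ~2.6 (Opus 4.7).
pct = extra_tokens_pct(3.5, 2.6)
print(f"{pct:.1f}% more tokens")  # 34.6% more tokens
```

Because the character count cancels, the percentage depends only on the ratio of the two averages, not on how long the text is.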

ChatGPT

Silent truncation: when the window fills, the oldest messages are dropped without notice. The only one of the three whose token counts can be reproduced exactly with a public library, tiktoken (cl100k_base encoding).

Free (GPT-5.4 mini): 32k · Plus (GPT-5.4): 272k · Pro: 1.05M

cl100k_base BPE · ~4.0 chars/token · vocab 100,277
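
Dropping the oldest messages to fit a window can be sketched as below. This is an illustrative sliding-window model, not ChatGPT's actual implementation; `estimate_tokens`, `truncate_oldest`, the plain-string message format, and the use of the ~4.0 chars/token average as a token estimate are all assumptions made for the example.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from an average chars/token ratio (assumed)."""
    return math.ceil(len(text) / chars_per_token)


def truncate_oldest(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the newest messages whose estimated tokens fit in the window."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > window_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
print(truncate_oldest(history, window_tokens=25))  # the oldest message is dropped
```

Nothing in the returned list signals that a message was removed, which is what makes this kind of truncation silent from the user's side.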

Gemini

The largest vocabulary of the three means the fewest tokens per word, so the same window holds more text. AI Ultra / 3.1 Pro has a 2,000,000-token limit.

Lite (3.1 Lite): 1M · Pro (3.5 Flash): 1M · Ultra (3.1 Pro): 2M

SentencePiece unigram · ~4.5 chars/token · vocab 256,000
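
To compare how much text each window actually holds, multiply the token limit by the average chars/token. A rough sketch using only the figures quoted in this section; the averages are approximations, and the dictionary and function names are illustrative:

```python
# (window in tokens, average chars/token) as quoted in this section.
MODELS = {
    "Claude Sonnet 4.6": (1_000_000, 3.5),
    "Claude Opus 4.7": (1_000_000, 2.6),
    "ChatGPT Plus (GPT-5.4)": (272_000, 4.0),
    "Gemini Ultra (3.1 Pro)": (2_000_000, 4.5),
}


def window_capacity_chars(window_tokens: int, chars_per_token: float) -> int:
    """Approximate number of characters that fit in a context window."""
    return round(window_tokens * chars_per_token)


for name, (window, ratio) in MODELS.items():
    print(f"{name}: ~{window_capacity_chars(window, ratio):,} characters")
```

Two models with the same token limit can therefore hold quite different amounts of text: under these averages a 1M-token window is roughly 3.5 million characters at ~3.5 chars/token but about 2.6 million at ~2.6.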