Tokens & context windows

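What is a token? Why do models forget? A token is a chunk of text (a whole word, part of a word, or a punctuation mark) that a model reads as a single unit, and the context window is the maximum number of tokens the model can take in at once. Each model family uses a different tokenizer, so the same text produces different token counts across Claude, ChatGPT, and Gemini.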

How each model tokenizes and handles context

Claude
Full recall up to the context limit: the whole conversation is re-read on every message, and Claude tells you when the limit is reached. Opus 4.7 uses a denser BPE tokenizer that produces about 35% more tokens for the same text than Haiku and Sonnet (~2.6 vs ~3.5 characters per token).
Haiku 4.5: 200k · Sonnet 4.6: 1M · Opus 4.7: 1M
Custom BPE · ~3.5 chars/token (Haiku/Sonnet) · ~2.6 (Opus 4.7)
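The "about 35% more tokens" figure follows from the two chars/token ratios quoted above. A minimal sketch of that arithmetic, assuming the ratios are rough averages; `estimate_tokens` is a hypothetical helper for illustration, not part of any Anthropic SDK:

```python
def estimate_tokens(num_chars: int, chars_per_token: float) -> int:
    """Rough token estimate from a character count and an average chars/token ratio."""
    return round(num_chars / chars_per_token)

# Average ratios quoted on this page (approximations, not exact tokenizer output).
HAIKU_SONNET_CHARS_PER_TOKEN = 3.5
OPUS_CHARS_PER_TOKEN = 2.6

text_chars = 10_000  # length of some sample text, in characters
haiku_sonnet_tokens = estimate_tokens(text_chars, HAIKU_SONNET_CHARS_PER_TOKEN)
opus_tokens = estimate_tokens(text_chars, OPUS_CHARS_PER_TOKEN)

# Relative increase in token count for the same text.
extra = opus_tokens / haiku_sonnet_tokens - 1
print(haiku_sonnet_tokens, opus_tokens, f"{extra:.0%}")  # prints: 2857 3846 35%
```

Real token counts depend on the actual text, so treat the result as an order-of-magnitude estimate.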
ChatGPT
Silent truncation: when the window fills, the oldest messages are dropped without any notice. ChatGPT is the only one of the three whose token counts can be reproduced exactly with a public tokenizer, tiktoken (cl100k).
Free (GPT-5.4 mini): 32k · Plus (GPT-5.4): 272k · Pro: 1.05M
cl100k BPE · ~4.0 chars/token · vocab 100,277
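Dropping the oldest messages first can be pictured as a sliding window over the conversation. The sketch below is only an illustration of that idea, not ChatGPT's actual implementation: `truncate_oldest` and `estimate_tokens` are hypothetical names, and token costs are approximated with the ~4.0 chars/token average rather than a real tokenizer.

```python
import math
from collections import deque

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate token cost of a message from its length (assumed average ratio)."""
    return max(1, math.ceil(len(text) / chars_per_token))

def truncate_oldest(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the newest messages that fit in the window; older ones are dropped."""
    kept: deque[str] = deque()
    used = 0
    # Walk from newest to oldest and stop at the first message that no longer fits.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > window_tokens:
            break
        kept.appendleft(message)
        used += cost
    return list(kept)

history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
print(len(truncate_oldest(history, 25)))   # prints: 2 (the oldest message is gone)
```

Nothing in the returned list signals that a message was removed, which is why this kind of truncation looks like the model "forgetting".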
Gemini
The largest vocabulary of the three means the fewest tokens per word, so the same window holds more text. The AI Ultra tier (3.1 Pro) has a 2,000,000-token limit.
Lite (3.1 Lite): 1M · Pro (3.5 Flash): 1M · Ultra (3.1 Pro): 2M
SentencePiece unigram · ~4.5 chars/token · vocab 256,000
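To see what "using the window more efficiently" means in practice, the window sizes and chars/token ratios from the cards above can be multiplied to estimate how many characters fit. This is a rough sketch under the assumption that the quoted averages hold for your text; `approx_chars_that_fit` is a hypothetical helper name.

```python
# (window size in tokens, average chars per token), as quoted on this page.
WINDOWS = {
    "Claude Sonnet 4.6": (1_000_000, 3.5),
    "Claude Opus 4.7": (1_000_000, 2.6),
    "ChatGPT Pro": (1_050_000, 4.0),
    "Gemini Ultra (3.1 Pro)": (2_000_000, 4.5),
}

def approx_chars_that_fit(window_tokens: int, chars_per_token: float) -> int:
    """Estimate how many characters of text fill a context window."""
    return round(window_tokens * chars_per_token)

for name, (window_tokens, chars_per_token) in WINDOWS.items():
    chars = approx_chars_that_fit(window_tokens, chars_per_token)
    print(f"{name}: ~{chars:,} characters")
```

Under these assumptions, two models with the same 1M-token window (Sonnet 4.6 and Opus 4.7) hold noticeably different amounts of text, because the denser tokenizer spends more tokens per character.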