
Attention & latency

Why does longer context make AI slower? Self-attention compares every token with every other token, so its compute cost scales as O(n²): double your token count and you quadruple the compute. Use the slider to feel the math.
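The quadratic relationship can be sketched in a few lines. This is an illustrative calculation only; the baseline window size is a hypothetical value, not a property of any specific model.

```python
def attention_cost_ratio(tokens: int, baseline_tokens: int) -> float:
    """Relative self-attention cost vs. a baseline context length.

    Self-attention is O(n^2) in token count, so the cost ratio is
    simply the square of the token-count ratio.
    """
    return (tokens / baseline_tokens) ** 2

# Doubling the context quadruples the compute:
print(attention_cost_ratio(40_000, 20_000))  # → 4.0

# 20,000 tokens against a hypothetical 100,000-token baseline window:
print(round(attention_cost_ratio(20_000, 100_000), 2))  # → 0.04
```

The second call shows how a figure like "0.04× attention cost vs baseline" falls out of the squared ratio: using only a fifth of the baseline window costs a twenty-fifth of the compute.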

O(n²) latency calculator

Example reading from the interactive calculator: at 20,000 tokens (10% of the context window used), attention cost is 0.04× the baseline, inside the fast-response zone. ContextCrunch compression benefit: 30%.
Ask about attention: live AI explanation, powered by Groq.

See the latency impact of your current conversation.

Try the tool →