Learn · 05
Attention & latency
Why does longer context make AI slower? Self-attention compares every token against every other token, so its cost scales as O(n²) in sequence length — double your tokens and you quadruple the compute. Use the slider to feel the math.
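The quadratic claim is easy to check numerically. The sketch below counts the dominant floating-point operations in one attention layer (the QKᵀ score matrix and the score-times-V product, each roughly n² · d multiply-adds); the function name and the head dimension of 64 are illustrative assumptions, not taken from any particular model.

```python
def attention_flops(n_tokens: int, d_head: int = 64) -> int:
    """Approximate FLOPs for one self-attention head.

    Counts only the two n^2 terms that dominate at long context:
    QK^T scores (n * n * d) and the scores @ V product (n * n * d).
    Linear-in-n work (projections, softmax) is ignored.
    """
    return 2 * n_tokens ** 2 * d_head


base = attention_flops(1024)     # cost at a 1k-token context
doubled = attention_flops(2048)  # same layer, 2x the tokens
print(doubled / base)  # → 4.0: doubling tokens quadruples attention compute
```

Running it prints `4.0`: twice the context, four times the attention work, which is exactly the O(n²) curve the slider traces.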
[Interactive O(n²) latency calculator: drag the context-window slider to see attention cost relative to baseline. Shown state: 10% of the context window used, attention cost 0.04× baseline, in the fast-response "safe zone"; a panel also displays the ContextCrunch compression benefit.]