The math and code behind ContextCrunch
Six interactive concepts. Each explained at three levels — plain English, technical, and academic. Live demos powered by the real backend. Code shown alongside every equation.
01 Tokens & context windows
What is a token? Why do models forget? Why does Claude stop while ChatGPT keeps going? Live tokenizer demo with a real backend.
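The live demo uses a real tokenizer, but the core intuition can be sketched with the common rule of thumb that English text averages roughly four characters per token (an assumption for illustration; actual subword tokenizers vary by model and language):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters/token rule of
    thumb for English. Real BPE tokenizers split on learned subwords,
    so counts differ, but this shows why context limits are measured
    in tokens rather than characters."""
    return max(1, round(len(text) / 4))

prompt = "Context windows are measured in tokens, not characters."
print(len(prompt), "chars, roughly", estimate_tokens(prompt), "tokens")
```

A model with a 200K-token window therefore holds on the order of 800K characters of English before older content must be dropped or compressed.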
02 Embeddings & semantic meaning
How does AI understand meaning? Enter sentences and see real cosine similarity computed from 384-dim embeddings.
03 Entropy & information theory
How do we measure information vs noise? Paste any text and see real Shannon entropy plus character frequency visualization.
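The entropy the demo reports is Shannon's formula applied to character frequencies. A self-contained sketch:

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """H = -sum over characters c of p(c) * log2(p(c)),
    measured in bits per character. Low entropy means repetitive,
    compressible text; high entropy means information-dense (or noisy)."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(shannon_entropy("aaaaaaaa"))  # ~0 bits/char: pure repetition
print(shannon_entropy("abababab"))  # 1 bit/char: two equally likely chars
```

English prose typically lands around 4 bits per character at this single-character level, which is why repetitive boilerplate compresses so well.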
04 Quantization
Scalar quantization, product quantization, and Google's TurboQuant (ICLR 2026). How ContextCrunch achieves a 9× memory reduction with near-zero accuracy loss.
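The simplest of the three techniques, scalar quantization, can be sketched in a few lines: store each float32 as an int8 plus one shared scale, a 4× reduction with bounded round-trip error. This is an illustrative sketch only, not ContextCrunch's actual pipeline, which layers product quantization on top to reach its larger reduction:

```python
def quantize_int8(values):
    """Symmetric scalar quantization: map floats into [-127, 127]
    using one shared scale. int8 storage is 4x smaller than float32."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid 0 scale
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats; error is at most scale / 2 per value."""
    return [x * scale for x in q]

weights = [0.12, -0.5, 0.33, 0.91, -0.07]
q, s = quantize_int8(weights)
restored = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, "max round-trip error:", max_err)
```

Product quantization goes further by splitting each vector into sub-vectors and replacing each with a codebook index, which is how memory reductions well beyond 4× become possible.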
05 Attention & latency
Why O(n²)? Use the slider to feel how quadratic compute growth affects response time. See compression speedup in real time.
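The O(n²) behind the slider comes from self-attention comparing every token against every other token, so the score matrix has n × n entries. A minimal sketch of the growth, and of why compressing the context pays off quadratically (the 9× figure is ContextCrunch's claimed reduction, used here only as an example input):

```python
def attention_score_count(n: int) -> int:
    """Self-attention computes one score per (query, key) pair,
    so work grows as n * n: doubling the context quadruples compute."""
    return n * n

for n in (1_000, 2_000, 4_000):
    print(f"{n:>5} tokens -> {attention_score_count(n):>12,} scores")

# Compressing the context 9x shrinks the score matrix 81x.
n = 9_000
print(attention_score_count(n) / attention_score_count(n // 9))  # 81.0
```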
06 Prompt efficiency
Paste any prompt. ContextCrunch rewrites it to say the same thing in fewer tokens using Groq Llama 3.3 70B and mutual information math.
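The actual rewriting is done by the LLM, but the "mutual information math" can be sketched with pointwise mutual information over a toy corpus (the corpus and word pair here are invented for illustration): when one word makes the next highly predictable, spelling both out carries little extra information, flagging redundancy a rewriter can remove.

```python
import math
from collections import Counter

def pmi(pair_counts, word_counts, total_pairs, total_words, x, y):
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ).
    High positive PMI: the pair co-occurs far more than chance,
    so the second word is largely predictable from the first."""
    p_xy = pair_counts[(x, y)] / total_pairs
    p_x = word_counts[x] / total_words
    p_y = word_counts[y] / total_words
    return math.log2(p_xy / (p_x * p_y))

corpus = "please kindly note please kindly note please kindly respond".split()
pairs = list(zip(corpus, corpus[1:]))
wc, pc = Counter(corpus), Counter(pairs)
print(pmi(pc, wc, len(pairs), len(corpus), "please", "kindly"))
```

In this toy corpus "kindly" always follows "please", so the pair scores well above zero and is a natural candidate for tightening.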