🥳 Happy to share our new work, Kinetics: Rethinking Test-Time Scaling Laws
🤔 How to effectively build a powerful reasoning agent?
Existing compute-optimal scaling laws suggest a 1.7B model with 64K thinking tokens beats a 32B model.
But that's only half of the picture!
🚨 The O(N²) KV memory access in self-attention dominates the cost of test-time scaling (TTS).
MoEs make the memory bottleneck even worse: they cut compute, but the KV cache stays the same size.
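Why does KV access dominate? Here's a minimal back-of-the-envelope sketch, not the paper's exact cost model: the `decode_cost` helper, the ~1.7B dense config, and the byte counts below are illustrative assumptions. Per decoded token, the model rereads its entire KV cache, so total KV traffic grows quadratically in generation length while FLOPs grow only linearly.

```python
# Back-of-envelope decode cost: FLOPs grow O(N), KV-cache reads grow O(N^2).
# All constants are illustrative assumptions, not the paper's cost model.

PARAMS   = 1.7e9                 # model parameters (dense ~1.7B, assumed)
KV_BYTES = 2 * 28 * 8 * 128 * 2  # K&V * layers * kv_heads * head_dim * fp16 bytes
                                 # ~112 KB of KV cache per token (assumed config)

def decode_cost(n_tokens):
    """Total cost of autoregressively generating n_tokens."""
    flops = 2 * PARAMS * n_tokens                          # ~2P FLOPs/token -> O(N)
    kv_reads = KV_BYTES * n_tokens * (n_tokens + 1) // 2   # cache reread each step -> O(N^2)
    return flops, kv_reads

for n in (1_000, 8_000, 64_000):
    flops, kv = decode_cost(n)
    print(f"N={n:>6}: FLOPs {flops:.2e} | KV bytes read {kv:.2e}")
```

On modern accelerators, bytes of memory bandwidth are roughly two orders of magnitude scarcer than FLOPs, so the quadratic KV term dominates long before the raw numbers cross.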
Our new scaling law, Kinetics, suggests: invest in model size first before spending more on test-time compute.
This insight leads to our next key finding:
✨ Sparse Attention = Scalable TTS
Our Kinetics sparse scaling law says that when doubling resources, we should prioritize increasing test-time tokens over attention density (see the sketch after the results below).
✅ 60+ points improvement under the same compute budget
✅ 10× lower resource usage for equivalent performance
✅ Sparse attention becomes increasingly valuable in high-cost scenarios
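Here's a toy illustration of that trade-off; the cost constants `C_FLOPS`/`C_KV` and the `tokens_affordable` helper are made up for illustration, not the paper's fitted law. With top-k sparse attention, per-token KV cost is fixed by the density k rather than the context length, so lowering density buys many more test-time tokens under the same budget.

```python
# Toy budget allocator: how many test-time tokens a fixed budget buys
# at a given attention density k (top-k KV entries attended per step).
# Constants are hypothetical, chosen only to illustrate the trade-off.

C_FLOPS = 1.0   # per-token compute cost (normalized units, assumed)
C_KV    = 0.25  # cost per attended KV entry per token (assumed)

def tokens_affordable(budget, k):
    """Tokens a budget buys when each token costs C_FLOPS + C_KV * k."""
    return budget / (C_FLOPS + C_KV * k)

budget = 1000.0
for k in (64, 16, 4, 1):
    print(f"density k={k:>3}: ~{tokens_affordable(budget, k):,.0f} tokens")
```

At these toy constants, dropping density from k=64 to k=4 buys roughly 8× more tokens from the same budget, which is why the sparse law favors spending on generation length over attention density.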
💡 Sparsity is key to unlocking the full potential of TTS: unlike pretraining, where scaling shows diminishing returns, TTS continues to benefit from increased token generation and more optimized inference paths.
arXiv: https://arxiv.org/abs/2506.05333
Website: https://infini-ai-lab.github.io/Kinetics/
Twitter: https://x.com/InfiniAILab/status/1931053042876768586