Recently, Zaremba et al. demonstrated that increasing inference-time
computation improves robustness in large proprietary reasoning LLMs. In this
paper, we first show that smaller-scale, open-source models (e.g., DeepSeek R1,
Qwen3, Phi-reasoning) can also benefit from inference-time scaling using a
simple budget forcing strategy. More importantly, we reveal and critically
examine an implicit assumption in prior work: that intermediate reasoning steps
are hidden from adversaries. By relaxing this assumption, we identify an
important security risk, which we intuitively motivate and empirically verify
as an inverse scaling law: when intermediate reasoning steps become explicitly
accessible to an adversary, increased inference-time computation consistently
reduces model robustness.
Finally, we discuss practical scenarios in which models with hidden reasoning
chains remain vulnerable to attacks, for example via tool-integrated reasoning
or advanced reasoning extraction attacks. Our findings collectively
demonstrate that the robustness benefits of inference-time scaling depend
heavily on the adversarial setting and deployment context. We urge
practitioners to carefully weigh these subtle trade-offs before applying
inference-time scaling in security-sensitive, real-world applications.
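For readers unfamiliar with the technique, the sketch below illustrates one common form of the budget forcing strategy mentioned above: if the model tries to stop reasoning before a token budget is spent, an explicit "Wait" cue pushes it to continue; once the budget is exhausted, the thought is closed and the answer is requested. The model name, the "</think>" end-of-thought marker, and the helper budget_forced_generate are illustrative assumptions, not necessarily the exact setup used in the paper.

```python
# Minimal sketch of a budget forcing loop with Hugging Face transformers.
# The model name, the "</think>" marker, and the "Wait" cue are assumptions
# for illustration; the paper's exact strategy may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed open-source reasoning model
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)

def budget_forced_generate(question: str, think_budget: int = 2048, max_waits: int = 2) -> str:
    """Keep the model reasoning until roughly `think_budget` thinking tokens are spent."""
    prompt = tok.apply_chat_template(
        [{"role": "user", "content": question}],
        tokenize=False,
        add_generation_prompt=True,
    )
    remaining = think_budget
    for _ in range(max_waits + 1):
        inputs = tok(prompt, return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=remaining, do_sample=False)
        new_tokens = out[0][inputs["input_ids"].shape[1]:]
        text = tok.decode(new_tokens, skip_special_tokens=True)
        if "</think>" in text and len(new_tokens) < remaining:
            # The model tried to stop thinking early: strip the end-of-thought
            # marker and append "Wait" to force additional reasoning.
            prompt += text.split("</think>")[0] + "\nWait"
            remaining -= len(new_tokens)
        else:
            return prompt + text
    # Budget (or wait count) exhausted: close the thought and ask for the answer.
    prompt += "\n</think>\n"
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    return tok.decode(out[0], skip_special_tokens=True)
```

In this setting, think_budget (together with max_waits) is the knob that realizes inference-time scaling: larger budgets force longer reasoning traces at the cost of more computation.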