Control-R: Towards Controllable Test-time Scaling

arXiv:2506.00189v1 Announce Type: new
Abstract: This paper addresses the challenges of underthinking and overthinking in long chain-of-thought (CoT) reasoning for Large Reasoning Models (LRMs) by introducing Reasoning Control Fields (RCF), a novel test-time approach that injects structured control signals to guide reasoning from a tree search perspective. RCF enables models to adjust their reasoning effort according to given control conditions when solving complex tasks. Additionally, we present the Control-R-4K dataset, which consists of challenging problems annotated with detailed reasoning processes and corresponding control fields. To further enhance reasoning control, we propose a Conditional Distillation Finetuning (CDF) method, which trains models, in particular Control-R-32B, to effectively adjust reasoning effort at test time. Experimental results on benchmarks such as AIME2024 and MATH500 demonstrate that our approach achieves state-of-the-art performance at the 32B scale while enabling a controllable long chain-of-thought (L-CoT) reasoning process. Overall, this work introduces an effective paradigm for controllable test-time scaling of reasoning.
