Paper page - Thinkless: LLM Learns When to Think

Reasoning Language Models, capable of extended chain-of-thought reasoning,
have demonstrated remarkable performance on tasks requiring complex logical
inference. However, applying elaborate reasoning for all queries often results
in substantial computational inefficiencies, particularly when many problems
admit straightforward solutions. This motivates an open question: Can LLMs
learn when to think? To answer this, we propose Thinkless, a learnable
framework that empowers an LLM to adaptively select between short-form and
long-form reasoning, based on both task complexity and the model’s ability.
Thinkless is trained under a reinforcement learning paradigm and employs two
control tokens, <short> for concise responses and <think> for detailed
reasoning. At the core of our method is a Decoupled Group Relative Policy
Optimization (DeGRPO) algorithm, which decomposes the learning objective of
hybrid reasoning into two components: (1) a control token loss that governs the
selection of the reasoning mode, and (2) a response loss that improves the
accuracy of the generated answers. This decoupled formulation enables
fine-grained control over the contributions of each objective, stabilizing
training and effectively preventing collapse observed in vanilla GRPO.
Empirically, on several benchmarks such as Minerva Algebra, MATH-500, and
GSM8K, Thinkless is able to reduce the usage of long-chain thinking by 50% –
90%, significantly improving the efficiency of Reasoning Language Models. The
code is available at https://github.com/VainF/Thinkless
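
The decoupled objective described above can be illustrated with a small sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the function names, the weight `alpha`, the use of a standard group-normalized advantage, and the per-response mean over token log-probabilities. It is not the authors' implementation; the linked repository is the reference for that.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples drawn for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def decoupled_loss(control_logps, response_logps, rewards, alpha=1.0):
    """Return (control_loss, response_loss, total_loss) for one group.

    control_logps:  log-probability of the chosen control token, one per sample
    response_logps: per-token log-probabilities of each sampled response
    rewards:        scalar reward per sample
    alpha:          hypothetical weight on the control-token term (an assumption)
    """
    adv = group_relative_advantages(rewards)
    n = len(rewards)
    # Term 1: mode selection. Exactly one token per sample, so its
    # contribution does not shrink as the response gets longer.
    control_loss = -sum(a * lp for a, lp in zip(adv, control_logps)) / n
    # Term 2: answer quality. Averaged over each response's own tokens.
    response_loss = -sum(a * mean(tok) for a, tok in zip(adv, response_logps)) / n
    return control_loss, response_loss, alpha * control_loss + response_loss


if __name__ == "__main__":
    # Two samples for one prompt: the first is rewarded, the second is not.
    c, r, total = decoupled_loss(
        control_logps=[-0.5, -1.5],
        response_logps=[[-0.1, -0.3], [-0.2, -0.2, -0.2]],
        rewards=[1.0, 0.0],
        alpha=2.0,
    )
    print(c, r, total)
```

Keeping the two terms separate means the weight on mode selection can be tuned independently of response length, which is the kind of fine-grained control the abstract refers to.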
