LC-R1, a post-training method guided by Brevity and Sufficiency principles, reduces unnecessary reasoning in Large Reasoning Models with minimal accuracy loss.
Large Reasoning Models (LRMs) have achieved remarkable success, yet they
often produce unnecessarily verbose reasoning chains. We identify a core
aspect of this issue as "invalid thinking": models tend to repeatedly
double-check their work after having already derived the correct answer. To
address this specific inefficiency, we move beyond the general principles of
Efficacy and Efficiency to propose two new, fine-grained principles: Brevity,
which advocates for eliminating redundancy, and Sufficiency, which ensures
critical reasoning steps are preserved. Guided by these principles, we
introduce LC-R1, a post-training method based on Group Relative Policy
Optimization (GRPO). LC-R1 employs a novel combination of a Length Reward for
overall conciseness and a Compress Reward that is specifically designed to
remove the invalid portion of the thinking process. Extensive experiments on
multiple reasoning benchmarks demonstrate that LC-R1 achieves a significant
reduction in sequence length (~50%) with only a marginal (~2%) drop in
accuracy, reaching a favorable trade-off point on the Pareto frontier that
prioritizes high compression. Our analysis further validates the robustness of
LC-R1 and provides valuable insights for developing more powerful yet
computationally efficient LRMs. Our code is released at
https://github.com/zxiangx/LC-R1.
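
To make the reward structure concrete, below is a minimal, hypothetical sketch of how a Length Reward and a Compress Reward might be combined with GRPO-style group-relative advantages. The abstract does not specify the actual reward formulas, so the reward shapes, the weights `w_len` and `w_comp`, and the fields `tokens`, `valid_tokens`, and `correct` are illustrative assumptions, not the paper's definitions.

```python
# Hypothetical sketch of LC-R1-style rewards combined with group-relative
# (GRPO-style) advantages. Reward shapes below are assumptions for
# illustration only; the paper's exact formulations are not given here.

from dataclasses import dataclass
from typing import List
import statistics


@dataclass
class Rollout:
    tokens: int          # total generated tokens
    valid_tokens: int    # tokens up to the point where the correct answer is first derived
    correct: bool        # whether the final answer is correct


def length_reward(r: Rollout, group: List[Rollout]) -> float:
    """Assumed shape: reward correct rollouts that are shorter than the group average."""
    if not r.correct:
        return 0.0
    mean_len = statistics.mean(g.tokens for g in group)
    return max(0.0, 1.0 - r.tokens / mean_len)


def compress_reward(r: Rollout) -> float:
    """Assumed shape: penalize 'invalid thinking' tokens emitted after the
    correct answer has already been derived (e.g., repeated double-checking)."""
    if not r.correct or r.tokens == 0:
        return 0.0
    return r.valid_tokens / r.tokens  # 1.0 means no post-answer redundancy


def group_relative_advantages(rollouts: List[Rollout],
                              w_len: float = 0.5,
                              w_comp: float = 0.5) -> List[float]:
    """GRPO-style normalization: subtract the group mean reward and divide
    by the group standard deviation, so advantages are relative within the
    sampled group of rollouts for the same prompt."""
    rewards = [
        w_len * length_reward(r, rollouts) + w_comp * compress_reward(r)
        for r in rollouts
    ]
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0
    return [(x - mu) / sigma for x in rewards]


if __name__ == "__main__":
    group = [
        Rollout(tokens=800, valid_tokens=500, correct=True),   # verbose, much re-checking
        Rollout(tokens=450, valid_tokens=440, correct=True),   # concise
        Rollout(tokens=600, valid_tokens=600, correct=False),  # wrong answer
    ]
    print(group_relative_advantages(group))
```

In this sketch the concise, low-redundancy rollout receives the highest advantage, which is the qualitative behavior the Length and Compress Rewards are described as encouraging; the actual training objective in LC-R1 may differ in its details.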