Advanced AI News
Hugging Face

SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression – Takara TLDR

By Advanced AI Editor | September 30, 2025 | 2 Mins Read


We introduce SIRI, Scaling Iterative Reinforcement Learning with Interleaved
Compression, a simple yet effective RL approach for Large Reasoning Models
(LRMs) that enables more efficient and accurate reasoning. Existing studies
have observed repetitive thinking patterns in LRMs, and attempts to reduce them
often come at the cost of performance. In this paper, we show that this
trade-off can be overcome through a training regime that iteratively alternates
between compressing and expanding the reasoning budget, by dynamically
adjusting the maximum rollout length during training. The compression phase
cuts the rollout length, forcing the model to make precise and valuable
decisions within a limited context, which effectively reduces redundant tokens
and increases reasoning density. The expansion phase then relaxes the length
limit, providing space for the model to explore and plan in long-horizon
settings. Remarkably, we find that after each compression-expansion cycle, the
model’s performance improves even as its output length decreases, steadily
pushing it closer to the Pareto frontier in the performance-efficiency
trade-off. Training on DeepSeek-R1-Distill-Qwen-1.5B, SIRI-low improves
performance on AIME24 by 43.2% while reducing token usage by 46.9% after three
iterations, and SIRI-high achieves the highest accuracy among all compared
methods (Figure 1). Our findings shed light on the potential of periodically
oscillating the LRM’s output truncation length during training to dynamically
balance exploration and efficiency in reasoning, converging towards an optimal
“sweet spot” between the two. Our models are publicly available.
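
The alternating schedule described above can be sketched in a few lines of Python. This is only an illustration of the idea of switching the maximum rollout length between a compression phase and an expansion phase. The abstract gives no concrete lengths or step counts, so every name and numeric value here (`CycleConfig`, `rollout_length_schedule`, `truncate_rollout`, the token limits, the steps per phase) is a hypothetical assumption, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class CycleConfig:
    # All values below are hypothetical placeholders, not taken from the paper.
    compress_len: int = 8_000   # max rollout length (tokens) during compression
    expand_len: int = 16_000    # max rollout length (tokens) during expansion
    compress_steps: int = 2     # training steps spent in each compression phase
    expand_steps: int = 2       # training steps spent in each expansion phase
    num_cycles: int = 3         # number of compression-expansion cycles


def rollout_length_schedule(cfg: CycleConfig) -> Iterator[Tuple[int, str, int]]:
    """Yield (step, phase, max_rollout_length) for every training step.

    Each cycle runs a compression phase first, then an expansion phase.
    """
    step = 0
    for _ in range(cfg.num_cycles):
        phases = (
            ("compress", cfg.compress_len, cfg.compress_steps),
            ("expand", cfg.expand_len, cfg.expand_steps),
        )
        for phase, max_len, n_steps in phases:
            for _ in range(n_steps):
                yield step, phase, max_len
                step += 1


def truncate_rollout(tokens: List[int], max_len: int) -> List[int]:
    """Cut a sampled rollout to the current length limit."""
    return tokens[:max_len]


if __name__ == "__main__":
    for step, phase, max_len in rollout_length_schedule(CycleConfig()):
        print(f"step={step} phase={phase} max_rollout_length={max_len}")
```

In a real RL training loop, the yielded limit would be passed to the sampler as the generation cap for that step. The reward computation and policy update are omitted here because the abstract does not describe them.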


