TechCrunch AI

The Reinforcement Gap — or why some AI skills improve faster than others  

By Advanced AI Editor | October 5, 2025 | 4 min read

AI coding tools are getting better fast. If you don’t work in code, it can be hard to notice how much things are changing, but GPT-5 and Gemini 2.5 have made a whole new set of developer tricks possible to automate, and last week Claude Sonnet 4.5 did it again.

At the same time, other skills are progressing more slowly. If you are using AI to write emails, you’re probably getting the same value out of it you did a year ago. Even when the model gets better, the product doesn’t always benefit — particularly when the product is a chatbot that’s doing a dozen different jobs at the same time. AI is still making progress, but it’s not as evenly distributed as it used to be. 

The reason for the difference in progress is simpler than it seems. Coding apps are benefiting from billions of easily measurable tests, which can train them to produce workable code. This is reinforcement learning (RL), arguably the biggest driver of AI progress over the past six months, and it’s getting more intricate all the time. You can do reinforcement learning with human graders, but it works best if there’s a clear pass-fail metric, so you can repeat it billions of times without having to stop for human input.
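
To make the mechanism concrete, here is a minimal sketch in Python of a reinforcement loop built around an automatic pass-fail grader. The toy model, the grading rule, and the loop itself are illustrative stand-ins rather than anything from a production pipeline; the point is only that once the grader is a program, the sample-grade-reward cycle can run without a human in the loop.

```python
import random

def toy_model(question: str) -> str:
    """Hypothetical stand-in for sampling an answer from a language model."""
    return random.choice(["4", "5", "22"])

def grade(answer: str, ground_truth: str) -> float:
    """Clear pass-fail metric: 1.0 if the answer matches, else 0.0.
    No human ever needs to look at the output."""
    return 1.0 if answer.strip() == ground_truth else 0.0

def rl_loop(steps: int = 100_000) -> float:
    """Repeat sample -> grade -> reward as many times as compute allows."""
    question, ground_truth = "What is 2 + 2?", "4"
    total_reward = 0.0
    for _ in range(steps):
        answer = toy_model(question)
        reward = grade(answer, ground_truth)
        # A real trainer (PPO, GRPO, ...) would use `reward` to update the
        # model's weights here; this sketch just tallies the pass rate.
        total_reward += reward
    return total_reward / steps

if __name__ == "__main__":
    print(f"pass rate under automated grading: {rl_loop():.3f}")
```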

As the industry relies increasingly on reinforcement learning to improve products, we’re seeing a real difference between capabilities that can be automatically graded and the ones that can’t. RL-friendly skills like bug-fixing and competitive math are getting better fast, while skills like writing make only incremental progress. 

In short, there’s a reinforcement gap — and it’s becoming one of the most important factors for what AI systems can and can’t do. 

In some ways, software development is the perfect subject for reinforcement learning. Even before AI, there was a whole sub-discipline devoted to testing how software would hold up under pressure — largely because developers needed to make sure their code wouldn’t break before they deployed it. So even the most elegant code still needs to pass through unit testing, integration testing, security testing, and so on. Human developers use these tests routinely to validate their code and, as Google’s senior director for dev tools recently told me, they’re just as useful for validating AI-generated code. Even more than that, they’re useful for reinforcement learning, since they’re already systematized and repeatable at a massive scale. 
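
Under the assumption of a Python code-generation task, the sketch below shows how an ordinary unit-test suite doubles as an RL grader: the model’s output is executed and the same assertions a developer would run become a binary reward. The candidate snippet and the sandboxing shortcut (a bare exec) are illustrative only.

```python
CANDIDATE = """
def fizzbuzz(n):
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)
"""

def reward_from_tests(candidate_source: str) -> float:
    """Run the generated code in an isolated namespace and grade it with the
    same assertions a human developer would use; any failure scores 0.0."""
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # a real pipeline would sandbox this
        fn = namespace["fizzbuzz"]
        assert fn(3) == "Fizz"
        assert fn(10) == "Buzz"
        assert fn(30) == "FizzBuzz"
        assert fn(7) == "7"
        return 1.0
    except Exception:
        return 0.0

print(reward_from_tests(CANDIDATE))  # 1.0 -> this sample reinforces the policy
```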

There’s no easy way to validate a well-written email or a good chatbot response; these skills are inherently subjective and harder to measure at scale. But not every task falls neatly into “easy to test” or “hard to test” categories. We don’t have an out-of-the-box testing kit for quarterly financial reports or actuarial science, but a well-capitalized accounting startup could probably build one from scratch. Some testing kits will work better than others, of course, and some companies will be smarter about how to approach the problem. But the testability of the underlying process is going to be the deciding factor in whether it can be made into a functional product instead of just an exciting demo.
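
As a purely hypothetical illustration of such a home-built testing kit, the sketch below grades a generated quarterly summary on one mechanical property: whether the line items sum to the reported total. A real kit would layer many checks like this; none of the names or numbers here come from the article.

```python
def grade_report(line_items: dict, reported_total: float,
                 tolerance: float = 0.01) -> float:
    """Mechanical check: 1.0 if the line items sum to the reported total
    within a small tolerance, else 0.0."""
    return 1.0 if abs(sum(line_items.values()) - reported_total) <= tolerance else 0.0

# A made-up "generated" quarterly summary to run through the kit.
generated_report = {
    "line_items": {"hardware": 1_200.0, "services": 800.0, "licensing": 500.0},
    "reported_total": 2_500.0,
}

print(grade_report(generated_report["line_items"],
                   generated_report["reported_total"]))  # 1.0 -> internally consistent
```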

Some processes turn out to be more testable than you might think. If you’d asked me last week, I would have put AI-generated video in the “hard to test” category, but the immense progress made by OpenAI’s new Sora 2 model shows it may not be as hard as it looks. In Sora 2, objects no longer appear and disappear out of nowhere. Faces hold their shape, looking like a specific person rather than just a collection of features. Sora 2 footage respects the laws of physics in both obvious and subtle ways. I suspect that, if you peeked behind the curtain, you’d find a robust reinforcement learning system for each of these qualities. Put together, they make the difference between photorealism and an entertaining hallucination. 

To be clear, this isn’t a hard and fast rule of artificial intelligence. It’s a result of the central role reinforcement learning is playing in AI development, which could easily change as models develop. But as long as RL is the primary tool for bringing AI products to market, the reinforcement gap will only grow bigger — with serious implications for both startups and the economy at large. If a process ends up on the right side of the reinforcement gap, startups will probably succeed in automating it — and anyone doing that work now may end up looking for a new career. The question of which healthcare services are RL-trainable, for instance, has enormous implications for the shape of the economy over the next 20 years. And if surprises like Sora 2 are any indication, we may not have to wait long for an answer.


