OpenAI wins gold at prestigious math competition – why that matters more than you think

By Advanced AI Editor | July 21, 2025 | 5 Mins Read

OpenAI has achieved a new milestone in the race to build AI models that can reason their way through complex math problems.

On Saturday, the company announced that one of its models achieved gold medal-level performance on the International Mathematical Olympiad (IMO), widely regarded as the most prestigious and difficult math competition in the world.

We achieved gold medal-level performance 🥇on the 2025 International Mathematical Olympiad with a general-purpose reasoning LLM!
Our model solved world-class math problems—at the level of top human contestants. A major milestone for AI and mathematics. https://t.co/u2RlFFavyT

— OpenAI (@OpenAI) July 19, 2025

Critically, the winning model wasn’t designed specifically to solve IMO problems, in the way that earlier systems like DeepMind’s AlphaGo — which famously beat the world’s leading Go player in 2016 — were trained on a massive dataset within a very narrow, task-specific domain. Rather, the winner was a general-purpose reasoning model, designed to think through problems methodically using natural language.

“This is an LLM doing math and not a specific formal math system,” OpenAI wrote in its X post. “It’s part of our main push towards general intelligence.”

(Disclosure: Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems. Ziff Davis also owns DownDetector.)  

Not much is known at this point about the identity of the model that was used. Alexander Wei, a researcher at OpenAI who led the IMO research, called it “an experimental reasoning LLM” in an X post. The post included an illustration of a strawberry wearing a gold medal, suggesting the model is built atop the company’s o1 family of reasoning models, which debuted in September 2024.

“To be clear: We’re releasing GPT-5 soon, but the model we used at IMO is a separate experimental model,” OpenAI added on X. “It uses new research techniques that will show up in future models — but we don’t plan to release a model with this level of capability for many months.”

How well did the model perform?

The IMO, which began in 1959, draws several hundred contestants from more than 100 countries each year; each country may send a team of up to six students.

Contestants must provide proof-based responses to a total of six questions over the course of two days. Those proofs are assessed by former IMO gold medalists, with unanimous consensus required for each final score. Fewer than 9% of participants achieve gold. 

According to Wei, OpenAI’s experimental model solved five of the six problems, scoring 35 out of 42 possible points (about 83%), enough for a gold medal. Each proof comprised hundreds of lines of text, representing the individual steps the model took to work through its reasoning. In keeping with the competition’s prohibition against calculators and other external tools, OpenAI’s model had no access to the internet; it reasoned through each problem step by step.
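The score arithmetic above can be checked with a short sketch. It assumes the standard IMO format, in which each of the six problems is worth 7 points, and that each of the five solved problems received full marks. The function name `imo_score` is hypothetical and used only for illustration.

```python
# Assumption: standard IMO marking, six problems worth 7 points each.
POINTS_PER_PROBLEM = 7
NUM_PROBLEMS = 6


def imo_score(problems_solved: int) -> tuple[int, int, float]:
    """Return (points earned, points possible, fraction earned).

    Hypothetical helper: assumes every solved problem gets full marks
    and every unsolved problem gets zero.
    """
    if not 0 <= problems_solved <= NUM_PROBLEMS:
        raise ValueError("problems_solved must be between 0 and 6")
    earned = problems_solved * POINTS_PER_PROBLEM
    possible = NUM_PROBLEMS * POINTS_PER_PROBLEM
    return earned, possible, earned / possible


earned, possible, share = imo_score(5)
print(f"{earned}/{possible} points ({share:.0%})")  # prints "35/42 points (83%)"
```

Under these assumptions, five fully solved problems reproduce the reported 35 of 42 points.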

The “model thinks for a long time,” Noam Brown, another OpenAI researcher involved in the research project, wrote in an X post. “o1 thought for seconds. Deep Research for minutes. This one thinks for hours. Importantly, it’s also more efficient with its thinking.”

Analysts had previously estimated that there was only an 18% chance that an AI system would win gold in the IMO by 2025, according to OpenAI. 

The big picture

For all of its impressive abilities, AI has long struggled with simple arithmetic and basic math word problems — tasks that one might think should be relatively straightforward for advanced algorithms. But unlike more narrow logical puzzles, math requires a level of abstract reasoning and conceptual juggling that has been beyond the reach of most AI systems. 

That’s been changing, however, at an extraordinarily rapid pace. A little over a year ago, AI models were still being assessed using grade school-level math benchmarks like GSM8K. Reasoning models like o1 and DeepSeek’s R1 quickly excelled, first acing high school-level benchmarks like AIME and then advancing to the university level and beyond.

A capacity for high-level mathematics has become the gold standard for reasoning models, since even a small amount of hallucination or corner-cutting can quickly and clearly ruin a model’s output. Such lapses are easier to get away with in other kinds of responses, such as help with a written essay, which are often open to interpretation.

OpenAI’s IMO gold medal shows that a scalable, general-purpose reasoning approach can surpass domain-specific models in tasks that have long been believed to be beyond the reach of current AI systems. As it turns out, you don’t need to build hyperfocused, AlphaGo-like models trained to do nothing but math; it’s enough to train models to parse language and reason carefully through problems. Given enough time, they can compete on par with world-class human mathematicians.

According to Brown, the current pace of innovation happening throughout the AI industry suggests that its mathematical and reasoning prowess will only grow from here. “I fully expect the trend to continue,” he wrote on X. “Importantly, I think we’re close to AI substantially contributing to scientific discovery.”
