Advanced AI News

DeepMind and OpenAI claim gold in International Mathematical Olympiad

By Advanced AI Editor, July 22, 2025

[Image: AIs are getting better at maths problems. Credit: Andresr/Getty Images]

Experimental AI models from Google DeepMind and OpenAI have achieved gold-medal-level performance at the International Mathematical Olympiad (IMO) for the first time.

The companies are hailing the moment as an important milestone for AIs that might one day solve hard scientific or mathematical problems, but mathematicians are more cautious because details of the models’ results and how they work haven’t been made public.

The IMO, one of the world’s most prestigious competitions for young mathematicians, has long been seen by AI researchers as a litmus test for mathematical reasoning that AI systems tend to struggle with.

After last year’s competition, held in Bath, UK, Google DeepMind announced that AI systems it had developed, called AlphaProof and AlphaGeometry, had together achieved a silver medal-level performance, but its entries weren’t graded by the competition’s official markers.

Before this year’s contest, which was held in Queensland, Australia, companies including Google, Huawei and TikTok-owner ByteDance, as well as academic researchers, approached the organisers to ask whether they could have their AI models’ performance officially graded, says Gregor Dolinar, the IMO’s president. The IMO agreed, with the proviso that the companies waited to announce their results until 28 July, when the IMO’s full closing ceremonies had been completed.

OpenAI also asked if it could participate in the competition, but after it was informed about the official scheme, it didn’t respond or register an entry, says Dolinar.

On 19 July, OpenAI announced that a new AI it had developed had achieved a gold medal score, as marked by three former IMO medallists independently of the official competition. The AI answered five out of six questions correctly in the same 4.5-hour time limit as the contestants, OpenAI said.

Two days later, Google DeepMind also announced that its AI system, called Gemini Deep Think, had achieved gold with the same score and time limits. Dolinar confirmed that this result was given by the IMO’s official markers.

Unlike Google’s AlphaProof and AlphaGeometry systems, which were crafted especially for the competition and worked with questions and answers written in a computer programming language called Lean, both Google’s and OpenAI’s models this year worked entirely in natural language.

Working in Lean meant the AI’s output could be instantly checked for correctness, but it is harder for non-experts to read. Thang Luong at Google, who worked on Gemini Deep Think, says the natural language approach could produce more understandable answers, as well as being applicable to generally useful AI systems.
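
To give a sense of what “instantly checked for correctness” means, here is a minimal Lean 4 sketch. It is not taken from AlphaProof or any competition entry; the theorem name is invented for illustration, and the statements are far simpler than an IMO problem. The point is only that Lean accepts the file if every proof is valid and rejects it otherwise.

```lean
-- Illustrative only: a trivial statement with a proof Lean can check mechanically.
-- `add_comm_example` is a hypothetical name; `Nat.add_comm` is from Lean's core library.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, proved by direct computation.
example : 2 + 2 = 4 := rfl
```

If either proof were wrong, for instance claiming `2 + 2 = 5`, Lean would report an error rather than accept the file, which is what makes the output checkable without a human reading it.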

Luong says the ability to verify solutions in a large language model has been made possible thanks to progress with reinforcement learning, a training method in which an AI is taught what success looks like and is left to figure out the rules and how to succeed solely through trial and error. This method was key to Google’s previous success with its game-playing AIs, such as AlphaZero.
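
The trial-and-error idea can be sketched with a toy example in Python. This is not how Gemini Deep Think or AlphaZero is trained; the environment, the `reward` function and the choice of “optimistic” starting values are all assumptions made for illustration. The learner is told only which outcome counts as success and has to discover the successful action by trying things.

```python
def reward(action: int) -> float:
    """Hypothetical environment: only action 3 counts as success."""
    return 1.0 if action == 3 else 0.0


def learn_by_trial_and_error(n_actions: int = 5, steps: int = 20) -> list[float]:
    """Estimate the value of each action purely from observed rewards."""
    # Optimistic start: assume every action is better than any real reward,
    # so the learner is pushed to try each one at least once.
    values = [2.0] * n_actions
    for _ in range(steps):
        # Greedy choice: pick the action currently believed to be best.
        action = max(range(n_actions), key=lambda a: values[a])
        # Replace the estimate with the reward actually observed.
        values[action] = reward(action)
    return values


if __name__ == "__main__":
    learned = learn_by_trial_and_error()
    best = max(range(len(learned)), key=lambda a: learned[a])
    print(learned, "best action:", best)
```

Nothing in the loop states which action is correct; the estimates change only in response to the rewards the learner observes.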

Google’s model also considers multiple solutions at once, in a mode called parallel thinking, as well as being trained on a dataset of maths problems specifically useful for the IMO, says Luong.
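
As a rough illustration of considering several candidate solutions at once, the Python sketch below runs a handful of attempts concurrently and keeps only those that pass a check. Google has not published how parallel thinking works, so this is a generic pattern rather than its method; `propose`, `verify` and the toy problem (finding an integer whose square is 144) are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def propose(seed: int) -> int:
    """Hypothetical stand-in for one line of reasoning: returns a candidate answer."""
    return 10 + seed


def verify(candidate: int) -> bool:
    """Cheap check: does the candidate actually satisfy x * x == 144?"""
    return candidate * candidate == 144


def parallel_attempts(n_attempts: int) -> list[int]:
    """Run several attempts concurrently and keep the candidates that verify."""
    with ThreadPoolExecutor(max_workers=n_attempts) as pool:
        candidates = list(pool.map(propose, range(n_attempts)))
    return [c for c in candidates if verify(c)]


if __name__ == "__main__":
    print(parallel_attempts(5))
```

The design choice worth noting is the separation between generating candidates and checking them: more attempts raise the chance that at least one survives verification.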

OpenAI has released few details on its system, apart from that it also uses reinforcement learning and “experimental research methods”.

“The progress is promising, but not performed in a controlled scientific fashion, and so I will not be able to assess it at this stage,” says Terence Tao at the University of California, Los Angeles. “Perhaps once the companies involved release some papers with more data, and hopefully enough access to the model for others to replicate the results, one can say something more definitive, but, for now, we largely have to trust the companies themselves for the claimed results.”

Geordie Williamson at the University of Sydney in Australia agrees. “I think it is remarkable that this is where we’re at. It is frustrating how little detail outsiders are provided with regarding internals,” says Williamson.

While systems working in natural language could be useful for non-mathematicians, they could also present a problem if models produce long proofs that are hard to check, says Joseph Myers, one of the organisers of this year’s IMO. “If AIs are ever to produce solutions to significant unsolved problems that might plausibly be correct but might also have a few subtle but fatal errors hidden accidentally, or potentially deliberately from a misaligned AI, having those AIs also generate a formal proof is key to having confidence in the correctness of a long AI output before attempting to read it.”

Both companies say that, in the coming months, they will offer these systems for testing to mathematicians at first, before releasing them to the wider public. The models could soon help with harder scientific research problems, says Junehyuk Jung at Google, who worked on Gemini Deep Think. “There are going to be many, many unsolved problems within reach,” he says.
