Advanced AI News

Will A.I. Soon Outsmart Humans? Play This Puzzle to Find Out.

By Advanced AI Bot, March 26, 2025


In 2019, an A.I. researcher, François Chollet, designed a puzzle game that was meant to be easy for humans but hard for machines.

The game, called ARC, became an important way for experts to track the progress of artificial intelligence and push back against the narrative that scientists are on the brink of building A.I. technology that will outsmart humanity.

Mr. Chollet’s colorful puzzles test the ability to quickly identify visual patterns based on just a few examples. To play the game, you look closely at the examples and try to find the pattern.

Each example uses the pattern to transform a grid of colored squares into a new grid of colored squares:

The pattern is the same for every example.

Now, fill in the new grid by applying the pattern you learned in the examples above.
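The task described above can be sketched in a few lines of Python. This is a toy illustration only, not how ARC is scored and not how any A.I. system solves it: the grids, the three candidate rules, and the names `CANDIDATE_RULES` and `find_rule` are all assumptions made up for this example. Each grid is a list of rows, and each integer stands for one color.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]  # each integer stands for one color


def flip_horizontal(grid: Grid) -> Grid:
    """Mirror every row left to right."""
    return [list(reversed(row)) for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    """Reverse the order of the rows."""
    return [list(row) for row in reversed(grid)]


def transpose(grid: Grid) -> Grid:
    """Swap rows and columns."""
    return [list(col) for col in zip(*grid)]


# Hypothetical, tiny rule set; real puzzles need far richer patterns.
CANDIDATE_RULES: Dict[str, Callable[[Grid], Grid]] = {
    "flip_vertical": flip_vertical,
    "transpose": transpose,
    "flip_horizontal": flip_horizontal,
}


def find_rule(examples: List[Tuple[Grid, Grid]]) -> Optional[str]:
    """Return the first rule that explains every example, if any."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(inp) == out for inp, out in examples):
            return name
    return None


# Two made-up example pairs that share one pattern.
examples = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 3, 0], [0, 4, 0]], [[0, 3, 3], [0, 4, 0]]),
]
test_input = [[5, 0, 0], [0, 6, 0]]

rule_name = find_rule(examples)
if rule_name is not None:
    answer = CANDIDATE_RULES[rule_name](test_input)
    print(rule_name, answer)
```

The sketch works only because the right rule happens to be in the list. The difficulty of the real puzzles is that the pattern is new each time, so there is no fixed list to search.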

For years, these puzzles proved to be nearly impossible for artificial intelligence, including chatbots like ChatGPT.

A.I. systems typically learned their skills by analyzing huge amounts of data culled from across the internet. That meant they could generate sentences by repeating concepts they had seen a thousand times before. But they couldn’t necessarily solve new logic puzzles after seeing only a few examples.

That is, until recently. In December, OpenAI said that its latest A.I. system, called OpenAI o3, had surpassed human performance on Mr. Chollet’s test. Unlike the original version of ChatGPT, o3 was able to spend time considering different possibilities before responding.

Some saw it as proof that A.I. systems were approaching artificial general intelligence, or A.G.I., which describes a machine that’s as smart as a human. Mr. Chollet had created his puzzles as a way of showing that machines were still a long way from this ambitious goal.

But the news also exposed the weaknesses in benchmark tests like ARC, short for Abstraction and Reasoning Corpus. For decades, researchers have set up milestones to track A.I.’s progress. But once these milestones were reached, they were exposed as insufficient measures of true intelligence.

Arvind Narayanan, a Princeton computer science professor and co-author of the book “AI Snake Oil,” said that any claim that the ARC test measured progress toward A.G.I. was “very much iffy.”

Still, Mr. Narayanan acknowledged that OpenAI’s technology demonstrated impressive skills in passing the ARC test. Some of the puzzles are not as easy as the one you just tried.

The one below is a little harder, and it, too, was correctly solved by OpenAI’s new A.I. system:

A puzzle like this shows that OpenAI’s technology is getting better at working through logic problems. But the average person can solve puzzles like this one in seconds. OpenAI’s technology consumed significant computing resources to pass the test.

Last June, Mr. Chollet teamed up with Mike Knoop, co-founder of the software company Zapier, to create what they called the ARC Prize. The pair financed a contest that promised $1 million to anyone who built an A.I. system that exceeded human performance on the benchmark, which they renamed “ARC-AGI.”

Companies and researchers submitted over 1,400 A.I. systems, but no one won the prize. All scored below 85 percent, which marked the performance of a “smart” human.

OpenAI’s o3 system correctly answered 87.5 percent of the puzzles. But the company ran afoul of competition rules because it spent nearly $1.5 million in electricity and computing costs to complete the test, according to pricing estimates.

OpenAI was also ineligible for the ARC Prize because it was not willing to publicly share the technology behind its A.I. system through a practice called open sourcing. Separately, OpenAI ran a “high-efficiency” variant of o3 that scored 75.7 percent on the test and cost less than $10,000.

“Intelligence is efficiency. And with these models, they are very far from human-level efficiency,” Mr. Chollet said.
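The efficiency gap can be made concrete with a rough calculation from the figures reported above. This is a sketch under stated assumptions: "dollars per percentage point" is an illustrative metric chosen for this example, not one used by the ARC Prize, and both cost figures are treated as upper bounds because the article gives them as "nearly $1.5 million" and "less than $10,000."

```python
def cost_per_point(total_cost_usd: float, score_percent: float) -> float:
    """Dollars spent per percentage point scored (illustrative metric)."""
    return total_cost_usd / score_percent


# Figures as reported in the article; costs are approximate upper bounds.
high_compute = {"cost_usd": 1_500_000, "score": 87.5}
high_efficiency = {"cost_usd": 10_000, "score": 75.7}

high_compute_rate = cost_per_point(high_compute["cost_usd"], high_compute["score"])
high_efficiency_rate = cost_per_point(
    high_efficiency["cost_usd"], high_efficiency["score"]
)

# Cost of each extra percentage point gained by the high-compute run.
marginal_rate = (high_compute["cost_usd"] - high_efficiency["cost_usd"]) / (
    high_compute["score"] - high_efficiency["score"]
)

print(f"high-compute:    ${high_compute_rate:,.0f} per point")
print(f"high-efficiency: ${high_efficiency_rate:,.0f} per point")
print(f"marginal:        ${marginal_rate:,.0f} per extra point")
```

On these numbers, the high-compute run costs on the order of a hundred times more per percentage point than the high-efficiency run.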

(The New York Times sued OpenAI and its partner, Microsoft, in 2023 for copyright infringement of news content related to A.I. systems.)

On Monday, the ARC Prize introduced a new benchmark, ARC-AGI-2, with hundreds of additional tasks. The puzzles are in the same colorful, grid-like game format as the original benchmark, but are more difficult.

“It’s going to be harder for humans, still very doable,” said Mr. Chollet. “It will be much, much harder for A.I. — o3 is not going to be solving ARC-AGI-2.”

Here is a puzzle from the new ARC-AGI-2 benchmark that OpenAI’s system tried and failed to solve. Remember, the same pattern applies to all the examples.

Now try to fill in the grid below according to the pattern you found in the examples:

This shows that although A.I. systems are better at dealing with problems they have never seen before, they still struggle.

[Interactive puzzles: a few additional examples from ARC-AGI-2, which focuses on problems that require multiple steps of reasoning.]

As OpenAI and other companies continue to improve their technology, they may pass the new version of ARC. But that does not mean that A.G.I. will be achieved.

Judging intelligence is subjective. There are countless intangible indicators of intelligence, from composing works of art to navigating moral dilemmas to intuiting emotions.

Companies like OpenAI have built chatbots that can answer questions, write poetry and even solve logic puzzles. In some ways, they have already exceeded the powers of the brain. OpenAI’s technology has outperformed its chief scientist, Jakub Pachocki, on a competitive programming test.

But these systems still make mistakes that the average person would never make. And they struggle to do simple things that humans can handle.

“You’re loading the dishwasher, and your dog comes over and starts licking the dishes. What do you do?” said Melanie Mitchell, a professor in A.I. at the Santa Fe Institute. “We sort of know how to do that, because we know all about dogs and dishes and all that. But would a dishwashing robot know how to do that?”

To Mr. Chollet, the ability to efficiently acquire new skills is something that comes naturally to humans but is still lacking in A.I. technology. And it’s what he has been targeting with the ARC-AGI benchmarks.

In January, the ARC Prize became a nonprofit foundation that serves as a “north star for A.G.I.” The ARC Prize team expects ARC-AGI-2 to last for about two years before it is solved by A.I. technology — though they would not be surprised if it happened sooner.

They have already started work on ARC-AGI-3, which they hope to debut in 2026. An early mock-up hints at a puzzle that involves interacting with a dynamic, grid-based game.

[Photo: A.I. researcher François Chollet designed a puzzle game meant to be easy for humans but hard for machines. Credit: Kelsey McClellan for The New York Times]

[Image: Early mock-up for ARC-AGI-3, a benchmark that could involve interacting with a dynamic, grid-based game. Credit: ARC Prize Foundation]

This is a step closer to what people deal with in the real world — a place filled with movement. It does not stand still like the puzzles you tried above.

Even this, however, will go only part of the way toward showing when machines have surpassed the brain. Humans navigate the physical world — not just the digital. The goal posts will continue to shift as A.I. advances.

“If it’s no longer possible for people like me to produce benchmarks that measure things that are easy for humans but impossible for A.I.,” Mr. Chollet said, “then you have A.G.I.”


