OpenAI Admits AI Hallucinations Are a Fundamental Flaw, Not an Engineering Fix

By Advanced AI Editor | September 21, 2025

OpenAI just dropped some uncomfortable news about artificial intelligence: no matter how much we improve these systems, they’ll always hallucinate. That means ChatGPT, Claude, and other AI chatbots will keep making up plausible-sounding information that’s completely wrong.

This isn’t coming from AI critics or skeptics. OpenAI researchers themselves published this study on September 4, essentially admitting that the technology powering their wildly popular ChatGPT has built-in flaws that can’t be fixed with better engineering.

OpenAI Research Reveals a Fundamental Flaw in LLMs

The research team, led by OpenAI’s Adam Tauman Kalai, Edwin Zhang, and Ofir Nachum, along with Georgia Tech’s Santosh S. Vempala, created a mathematical framework that proves why AI systems must generate false information. They compared it to students guessing on difficult exam questions instead of admitting they don’t know the answer.

“Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty,” the researchers explained.

Here’s the kicker: even when trained on perfect data, these systems will still hallucinate. The study shows that AI’s “generative error rate is at least twice the misclassification rate,” meaning there are mathematical limits that no amount of technological advancement can overcome.
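In rough symbols (our paraphrase of that quoted bound, not the paper's exact notation), if err_gen is the rate at which a model generates invalid statements and err_cls is the error rate of the easier companion task of merely classifying statements as valid or invalid, the result is

\[ \mathrm{err}_{\mathrm{gen}} \;\ge\; 2 \cdot \mathrm{err}_{\mathrm{cls}} \]

On the paper's account, that classification error can never reach zero given finite training data, so the generation error stays pinned above it no matter how the model is engineered.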

The researchers tested this theory on current top-tier models. When they asked "How many Ds are in DEEPSEEK?" the DeepSeek-V3 model with 600 billion parameters gave answers ranging from 2 to 3 across ten trials. The correct answer is 1. Meta AI and Claude made similar mistakes, with some responses as wildly off as 6 or 7.
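The failure is instructive because the task is trivial for ordinary string handling; language models operate on tokens rather than individual characters, which is one common explanation for errors like this. A quick check (our illustration, not from the study):

```python
# Counting a letter in a string is exact and instant in plain code;
# token-based models, which never see individual characters, often miss it.
word = "DEEPSEEK"
print(word.count("D"))  # -> 1
```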

(Image credit: Computerworld)

Even more concerning, OpenAI’s own advanced reasoning models performed worse than simpler systems. Their o1 model hallucinated 16% of the time when summarizing public information. The newer o3 and o4-mini models were even less reliable, hallucinating 33% and 48% of the time respectively.

The Three Core Reasons for AI Hallucinations

The study identified three core reasons why hallucinations are unavoidable:

First, there’s “epistemic uncertainty”: when a piece of information appears only rarely in the training data, the model simply doesn’t have enough examples to learn it. Second, current AI architectures have fundamental limitations in what they can represent. Third, some problems are computationally impossible to solve, even for hypothetically superintelligent systems.

Neil Shah from Counterpoint Technologies put it bluntly: “Unlike human intelligence, it lacks the humility to acknowledge uncertainty. When unsure, it doesn’t defer to deeper research or human oversight; instead, it often presents estimates as facts.”

The research also revealed something troubling about how we evaluate AI systems. Nine out of ten major AI benchmarks use binary grading that awards zero points for saying “I don’t know,” exactly as if the model had answered incorrectly, so a confident guess is never worse and sometimes better.

This creates a perverse incentive where AI systems learn to always give an answer, even when they’re uncertain. The researchers argue that “language models hallucinate because the training and evaluation procedures reward guessing over acknowledging uncertainty.”
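The arithmetic behind that incentive is easy to check with a toy scoring rule (our illustration, not the paper's benchmark analysis). Under binary grading, a wrong answer and an "I don't know" both score zero, so any nonzero chance of being right makes guessing the dominant strategy; only a penalty for confident errors changes the calculus:

```python
# Expected benchmark score for a model that is right with probability p.
# Binary grading: correct = 1, wrong = 0, abstain ("I don't know") = 0.
def expected_scores(p: float, wrong_penalty: float = 0.0) -> tuple[float, float]:
    guess = p * 1.0 - (1.0 - p) * wrong_penalty
    abstain = 0.0
    return guess, abstain

# Even at 10% confidence, guessing beats abstaining under binary grading.
print(expected_scores(0.10))                     # (0.1, 0.0)
# A penalty for confident errors flips the incentive at low confidence.
print(expected_scores(0.10, wrong_penalty=0.5))  # (-0.35, 0.0)
```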

Why Flawless Models Are Impossible

For companies already using AI, this research demands a complete strategy overhaul. Charlie Dai from Forrester notes that enterprises are “increasingly struggling with model quality challenges in production, especially in regulated sectors like finance and healthcare.”

The solution isn’t trying to eliminate hallucinations – that’s mathematically impossible. Instead, businesses need to shift from prevention to risk management. This means implementing stronger human oversight, creating domain-specific safety measures, and continuously monitoring AI outputs.

Dai recommends that companies “prioritize calibrated confidence and transparency over raw benchmark scores” when choosing AI vendors. Look for systems that provide uncertainty estimates and have been tested in real-world scenarios, not just laboratory benchmarks.
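In practice, that risk-management posture often comes down to a confidence gate in front of any consequential output. Here is a minimal sketch, assuming a hypothetical answer_with_confidence() call that returns an answer plus a calibrated score in [0, 1]; no real vendor API is implied:

```python
import random

def answer_with_confidence(question: str) -> tuple[str, float]:
    """Hypothetical stand-in for a vendor API that returns a model
    answer along with a calibrated confidence score in [0, 1]."""
    return "a plausible answer", random.random()

CONFIDENCE_FLOOR = 0.8  # tune per domain; regulated sectors may set it higher

def answer_or_escalate(question: str) -> str:
    answer, confidence = answer_with_confidence(question)
    if confidence >= CONFIDENCE_FLOOR:
        return answer
    # Below the floor: defer to human oversight instead of presenting a guess.
    return f"[escalated to human review; confidence={confidence:.2f}]"

print(answer_or_escalate("How many Ds are in DEEPSEEK?"))
```

The point is not the particular threshold but the routing: uncertain outputs become review items rather than answers.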

Shah suggests the industry needs evaluation standards similar to automotive safety ratings: dynamic grades that reflect each model’s reliability and risk profile. The current approach of treating all AI outputs as equally trustworthy clearly isn’t working.

The message for anyone using AI is clear: these systems will always make mistakes. The key is building processes that account for this reality rather than hoping the technology will eventually become perfect. As the OpenAI researchers concluded, some level of unreliability will persist regardless of technical improvements.


