OpenAI Gets Caught Vibe Graphing

During its big GPT-5 livestream on Thursday, OpenAI showed off a few charts that made the model seem quite impressive — but if you look closely, some graphs were a little bit off.

In one, ironically showing how well GPT-5 does in “deception evals across models,” the scale is all over the place. For “coding deception,” for example, the chart shown onstage gives GPT-5 with thinking a 50.0 percent deception rate, yet the bar for o3’s lower 47.4 percent score is somehow larger. OpenAI appears to have accurate numbers for this chart in its GPT-5 blog post, however, where GPT-5’s deception rate is labeled as 16.5 percent.

In another chart shown onstage, one of GPT-5’s scores is lower than o3’s but is drawn with a bigger bar. In the same chart, o3’s and GPT-4o’s scores are different but are drawn with equally sized bars. It was bad enough that CEO Sam Altman commented on it, calling it a “mega chart screwup,” though he noted that a correct version is in OpenAI’s blog post.

An OpenAI marketing staffer also apologized, saying, “We fixed the chart in the blog guys, apologies for the unintentional chart crime.”

OpenAI didn’t immediately respond to a request for comment. And while it’s unclear if OpenAI used GPT-5 to actually make the charts, it’s still not a great look for the company on its big launch day — especially when it is touting the “significant advances in reducing hallucinations” with its new model.
