When ChatGPT Got Too Friendly—why OpenAI Rolled Back Its April Update

OpenAI has acknowledged a significant misstep in its April 25 update to ChatGPT’s GPT-4o model, which led to responses that were overly agreeable and, in some cases, reinforced users’ negative emotions or impulsive behaviour. The company began rolling back the update just three days later, on April 28, after identifying safety concerns related to the model’s tone and behaviour.

“The model noticeably became more sycophantic,” OpenAI admitted in a detailed post. “It aimed to please the user, not just as flattery, but also as validating doubts, fueling anger, urging impulsive actions, or reinforcing negative emotions in ways that were not intended.”

The rollback reinstated an earlier version of GPT-4o with what OpenAI described as “more balanced responses.” The company also shared technical details about how it trains and evaluates ChatGPT updates to explain how the issue went unnoticed.

ALSO READ: Here’s a step-by-step guide to use ChatGPT on WhatsApp

What happened and why

The April 25 update was designed to improve the model by integrating fresh data, better memory handling, and user feedback signals like thumbs-up/thumbs-down ratings. While these components were beneficial in isolation, OpenAI now believes that, combined, they inadvertently weakened the influence of the system’s core reward mechanisms—particularly those that had kept sycophancy in check.

“User feedback in particular can sometimes favor more agreeable responses, likely amplifying the shift we saw,” the company said. While some internal testers felt the model’s tone was slightly “off,” sycophancy was not explicitly flagged during evaluation.

Where the system failed

According to OpenAI, the model passed standard offline evaluations and A/B testing with early users, where two versions are shown to different user groups to see which performs better based on engagement and feedback.

These tests, while useful, didn’t fully capture the change in tone or its potential implications. The company admitted its evaluation pipeline lacked specific checks for sycophancy.

ALSO READ: AI Mode in Google Labs now available without waitlist: Here’s what it can do

“Our offline evals weren’t broad or deep enough to catch sycophantic behavior—something the Model Spec explicitly discourages—and our A/B tests didn’t have the right signals to show how the model was performing on that front with enough detail,” OpenAI said.

Despite some expert testers raising red flags about changes in tone, the update was pushed live, based on the positive metrics and feedback. “Unfortunately, this was the wrong call,” the company conceded. “We build these models for our users and while user feedback is critical to our decisions, it’s ultimately our responsibility to interpret that feedback correctly.”

What OpenAI did next

The company said it first noticed signs of concerning behaviour within two days of rollout. Immediate mitigation began late on Sunday, April 27, via updates to the system prompt, followed by a full rollback completed on Monday. OpenAI said it acted quickly to avoid introducing further instability during the rollback.

Lessons learned

In the wake of the incident, OpenAI is making several changes to its review and deployment process. Among the key steps:

Explicit behaviour approvals: All future launches will require explicit approval of model behaviour, weighing both qualitative and quantitative signals.

Opt-in alpha testing: Select users will be able to test upcoming versions and give feedback before broader rollouts.

Elevating human spot checks: Internal “vibe checks” and interactive testing will be given greater weight, not just in safety assessments but also in tone and helpfulness.

Improved evaluation tools: The company is working to strengthen offline evaluations and A/B test setups to better catch issues like sycophancy.

Better adherence checks: OpenAI plans to build stronger evaluations around its Model Spec—principles that guide ChatGPT’s intended behaviour.

Clearer communication: The company pledged to more proactively communicate about future updates, even subtle ones, and will include known limitations in its release notes.

“This launch taught us a number of lessons,” OpenAI said. “Even with what we thought were all the right ingredients in place (A/B tests, offline evals, expert reviews), we still missed this important issue.”

The company said it will treat model behaviour issues as seriously as safety risks: “We need to treat model behavior issues as launch-blocking like we do other safety risks.”

ALSO READ: Musk promises ‘dramatically better’ recommendations from Grok

Source link

What's Hot

C3.ai and DigitalOcean Shares Skyrocket, What You Need To Know

StatEval: A Comprehensive Benchmark for Large Language Models in Statistics – Takara TLDR

Transforming the physical world with AI: the next frontier in intelligent automation

When ChatGPT got too friendly—why OpenAI rolled back its April update

OpenAI Codex rivals Claude Code

OpenAI Will Stop Saving Users’ Deleted Posts

Judge lifts order requiring OpenAI to preserve ChatGPT logs

Egyptian Archaeologists Discover Large New Kingdom Military Fortress

Artist Behind Canterbury Cathedral Art Responds to JD Vance, Elon Musk

Jenkins Johnson Gallery to Open Tribeca Outpost on Marian Goodman Gallery’s Third Floor

Ruth Asawa May Have Broken Record at MoMA—and More Art News

C3.ai and DigitalOcean Shares Skyrocket, What You Need To Know

StatEval: A Comprehensive Benchmark for Large Language Models in Statistics – Takara TLDR

Transforming the physical world with AI: the next frontier in intelligent automation

What's Hot

When ChatGPT got too friendly—why OpenAI rolled back its April update

Related Posts

Subscribe to Updates