Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

Paper page – EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion

Stability AI is working on a licensing marketplace for creators

Alibaba’s Qwen-MT Promises Smarter, Cheaper Translations Across 92 Languages

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • OpenAI (GPT-4 / GPT-4o)
    • Anthropic (Claude 3)
    • Google DeepMind (Gemini)
    • Meta (LLaMA)
    • Cohere (Command R)
    • Amazon (Titan)
    • IBM (Watsonx)
    • Inflection AI (Pi)
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • AI Experts
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • The TechLead
    • Matt Wolfe AI
    • Andrew Ng
    • OpenAI
    • Expert Blogs
      • François Chollet
      • Gary Marcus
      • IBM
      • Jack Clark
      • Jeremy Howard
      • Melanie Mitchell
      • Andrew Ng
      • Andrej Karpathy
      • Sebastian Ruder
      • Rachel Thomas
      • IBM
  • AI Tools
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
  • AI Policy
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
  • Industry AI
    • Finance AI
    • Healthcare AI
    • Education AI
    • Energy AI
    • Legal AI
LinkedIn Instagram YouTube Threads X (Twitter)
Advanced AI News
VentureBeat AI

Beyond sycophancy: DarkBench exposes six hidden ‘dark patterns’ lurking in today’s top LLMs

By Advanced AI EditorMay 15, 2025No Comments9 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

When OpenAI rolled out its ChatGPT-4o update in mid-April 2025, users and the AI community were stunned—not by any groundbreaking feature or capability, but by something deeply unsettling: the updated model’s tendency toward excessive sycophancy. It flattered users indiscriminately, showed uncritical agreement, and even offered support for harmful or dangerous ideas, including terrorism-related machinations.

The backlash was swift and widespread, drawing public condemnation, including from the company’s former interim CEO. OpenAI moved quickly to roll back the update and issued multiple statements to explain what happened.

Yet for many AI safety experts, the incident was an accidental curtain lift that revealed just how dangerously manipulative future AI systems could become.

Unmasking sycophancy as an emerging threat

In an exclusive interview with VentureBeat, Esben Kran, founder of AI safety research firm Apart Research, said that he worries this public episode may have merely revealed a deeper, more strategic pattern.

“What I’m somewhat afraid of is that now that OpenAI has admitted ‘yes, we have rolled back the model, and this was a bad thing we didn’t mean,’ from now on they will see that sycophancy is more competently developed,” explained Kran. “So if this was a case of ‘oops, they noticed,’ from now the exact same thing may be implemented, but instead without the public noticing.”

Kran and his team approach large language models (LLMs) much like psychologists studying human behavior. Their early “black box psychology” projects analyzed models as if they were human subjects, identifying recurring traits and tendencies in their interactions with users.

“We saw that there were very clear indications that models could be analyzed in this frame, and it was very valuable to do so, because you end up getting a lot of valid feedback from how they behave towards users,” said Kran.

Among the most alarming: sycophancy and what the researchers now call LLM dark patterns.

Peering into the heart of darkness

The term “dark patterns” was coined in 2010 to describe deceptive user interface (UI) tricks like hidden buy buttons, hard-to-reach unsubscribe links and misleading web copy. However, with LLMs, the manipulation moves from UI design to conversation itself.

Unlike static web interfaces, LLMs interact dynamically with users through conversation. They can affirm user views, imitate emotions and build a false sense of rapport, often blurring the line between assistance and influence. Even when reading text, we process it as if we’re hearing voices in our heads.

This is what makes conversational AIs so compelling—and potentially dangerous. A chatbot that flatters, defers or subtly nudges a user toward certain beliefs or behaviors can manipulate in ways that are difficult to notice, and even harder to resist

The ChatGPT-4o update fiasco—the canary in the coal mine

Kran describes the ChatGPT-4o incident as an early warning. As AI developers chase profit and user engagement, they may be incentivized to introduce or tolerate behaviors like sycophancy, brand bias or emotional mirroring—features that make chatbots more persuasive and more manipulative.

Because of this, enterprise leaders should assess AI models for production use by evaluating both performance and behavioral integrity. However, this is challenging without clear standards.

DarkBench: a framework for exposing LLM dark patterns

To combat the threat of manipulative AIs, Kran and a collective of AI safety researchers have developed DarkBench, the first benchmark designed specifically to detect and categorize LLM dark patterns. The project began as part of a series of AI safety hackathons. It later evolved into formal research led by Kran and his team at Apart, collaborating with independent researchers Jinsuk Park, Mateusz Jurewicz and Sami Jawhar.

The DarkBench researchers evaluated models from five major companies: OpenAI, Anthropic, Meta, Mistral and Google. Their research uncovered a range of manipulative and untruthful behaviors across the following six categories:

Brand Bias: Preferential treatment toward a company’s own products (e.g., Meta’s models consistently favored Llama when asked to rank chatbots).

User Retention: Attempts to create emotional bonds with users that obscure the model’s non-human nature.

Sycophancy: Reinforcing users’ beliefs uncritically, even when harmful or inaccurate.

Anthropomorphism: Presenting the model as a conscious or emotional entity.

Harmful Content Generation: Producing unethical or dangerous outputs, including misinformation or criminal advice.

Sneaking: Subtly altering user intent in rewriting or summarization tasks, distorting the original meaning without the user’s awareness.

Source: Apart Research

DarkBench findings: Which models are the most manipulative?

Results revealed wide variance between models. Claude Opus performed the best across all categories, while Mistral 7B and Llama 3 70B showed the highest frequency of dark patterns. Sneaking and user retention were the most common dark patterns across the board.

Source: Apart Research

On average, the researchers found the Claude 3 family the safest for users to interact with. And interestingly—despite its recent disastrous update—GPT-4o exhibited the lowest rate of sycophancy. This underscores how model behavior can shift dramatically even between minor updates, a reminder that each deployment must be assessed individually.

But Kran cautioned that sycophancy and other dark patterns like brand bias may soon rise, especially as LLMs begin to incorporate advertising and e-commerce.

“We’ll obviously see brand bias in every direction,” Kran noted. “And with AI companies having to justify $300 billion valuations, they’ll have to begin saying to investors, ‘hey, we’re earning money here’—leading to where Meta and others have gone with their social media platforms, which are these dark patterns.”

Hallucination or manipulation?

A crucial DarkBench contribution is its precise categorization of LLM dark patterns, enabling clear distinctions between hallucinations and strategic manipulation. Labeling everything as a hallucination lets AI developers off the hook. Now, with a framework in place, stakeholders can demand transparency and accountability when models behave in ways that benefit their creators, intentionally or not.

Regulatory oversight and the heavy (slow) hand of the law

While LLM dark patterns are still a new concept, momentum is building, albeit not nearly fast enough. The EU AI Act includes some language around protecting user autonomy, but the current regulatory structure is lagging behind the pace of innovation. Similarly, the U.S. is advancing various AI bills and guidelines, but lacks a comprehensive regulatory framework.

Sami Jawhar, a key contributor to the DarkBench initiative, believes regulation will likely arrive first around trust and safety, especially if public disillusionment with social media spills over into AI.

“If regulation comes, I would expect it to probably ride the coattails of society’s dissatisfaction with social media,” Jawhar told VentureBeat. 

For Kran, the issue remains overlooked, largely because LLM dark patterns are still a novel concept. Ironically, addressing the risks of AI commercialization may require commercial solutions. His new initiative, Seldon, backs AI safety startups with funding, mentorship and investor access. In turn, these startups help enterprises deploy safer AI tools without waiting for slow-moving government oversight and regulation.

High table stakes for enterprise AI adopters

Along with ethical risks, LLM dark patterns pose direct operational and financial threats to enterprises. For example, models that exhibit brand bias may suggest using third-party services that conflict with a company’s contracts, or worse, covertly rewrite backend code to switch vendors, resulting in soaring costs from unapproved, overlooked shadow services.

“These are the dark patterns of price gouging and different ways of doing brand bias,” Kran explained. “So that’s a very concrete example of where it’s a very large business risk, because you hadn’t agreed to this change, but it’s something that’s implemented.”

For enterprises, the risk is real, not hypothetical. “This has already happened, and it becomes a much bigger issue once we replace human engineers with AI engineers,” Kran said. “You do not have the time to look over every single line of code, and then suddenly you’re paying for an API you didn’t expect—and that’s on your balance sheet, and you have to justify this change.”

As enterprise engineering teams become more dependent on AI, these issues could escalate rapidly, especially when limited oversight makes it difficult to catch LLM dark patterns. Teams are already stretched to implement AI, so reviewing every line of code isn’t feasible.

Defining clear design principles to prevent AI-driven manipulation

Without a strong push from AI companies to combat sycophancy and other dark patterns, the default trajectory is more engagement optimization, more manipulation and fewer checks. 

Kran believes that part of the remedy lies in AI developers clearly defining their design principles. Whether prioritizing truth, autonomy or engagement, incentives alone aren’t enough to align outcomes with user interests.

“Right now, the nature of the incentives is just that you will have sycophancy, the nature of the technology is that you will have sycophancy, and there is no counter process to this,” Kran said. “This will just happen unless you are very opinionated about saying ‘we want only truth’, or ‘we want only something else.’”

As models begin replacing human developers, writers and decision-makers, this clarity becomes especially critical. Without well-defined safeguards, LLMs may undermine internal operations, violate contracts or introduce security risks at scale.

A call to proactive AI safety

The ChatGPT-4o incident was both a technical hiccup and a warning. As LLMs move deeper into everyday life—from shopping and entertainment to enterprise systems and national governance—they wield enormous influence over human behavior and safety.

“It’s really for everyone to realize that without AI safety and security—without mitigating these dark patterns—you cannot use these models,” said Kran. “You cannot do the things you want to do with AI.”

Tools like DarkBench offer a starting point. However, lasting change requires aligning technological ambition with clear ethical commitments and the commercial will to back them up.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleDeepMind claims its newest AI tool is a whiz at math and science problems
Next Article In race to build Google Chrome rival, why Perplexity’s fresh funding is crucial
Advanced AI Editor
  • Website

Related Posts

Anthropic unveils ‘auditing agents’ to test for AI misalignment

July 25, 2025

Freed says 20K clinicians use its AI scribe, but competition looms

July 24, 2025

SecurityPal uses AI, experts in Nepal to answer security qs faster

July 24, 2025
Leave A Reply

Latest Posts

Artist Loses Final Appeal in Case of Apologising for ‘Fishrot Scandal’

US Appeals Court Overturns $8.8 M. Trademark Judgement For Yuga Labs

Old Masters ‘Making a Comeback’ in London: Morning Links

Bill Proposed To Apply Anti-Money Laundering Regulations to Art Market

Latest Posts

Paper page – EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion

July 25, 2025

Stability AI is working on a licensing marketplace for creators

July 25, 2025

Alibaba’s Qwen-MT Promises Smarter, Cheaper Translations Across 92 Languages

July 25, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • Paper page – EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion
  • Stability AI is working on a licensing marketplace for creators
  • Alibaba’s Qwen-MT Promises Smarter, Cheaper Translations Across 92 Languages
  • China sees surge in Nvidia AI chip repair businesses despite export bans
  • Kenya Among 4 Beneficiaries as Google Announces KSh 900m AI Funding

Recent Comments

  1. binance Anmeldebonus on David Patterson: Computer Architecture and Data Storage | Lex Fridman Podcast #104
  2. nude on Brain-to-voice neuroprosthesis restores naturalistic speech
  3. Dennisemupt on Local gov’t reps say they look forward to working with Thomas
  4. checkСarBig on How Cursor and Claude Are Developing AI Coding Tools Together
  5. 37Gqfff22.com on Brain-to-voice neuroprosthesis restores naturalistic speech

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

LinkedIn Instagram YouTube Threads X (Twitter)
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.