Anthropic’s Claude AI Will Now Cut Off Abusive Chats With Users For Its Own ‘welfare’

Amazon-backed Anthropic has said its most capable Artificial Intelligence (AI) models, Claude Opus 4 and 4.1, will now exit a conversation if the user is abusive or persistently harmful in their interactions.

The move is aimed at improving the ‘welfare’ of AI systems in potentially distressing situations, the company said in a blog post on Friday, August 15. “We’re treating this feature as an ongoing experiment and will continue refining our approach,” it said.

If Claude ends a conversation, users can either edit and re-submit their previous prompt or start a new chat. They can also give feedback by reacting to Claude’s message with thumbs up/down, or by using the dedicated ‘Give feedback’ button.

However, Claude will not be able to end chats on its own in cases where users might be at imminent risk of harming themselves or others, Anthropic further stated.

The new feature, developed as part of Anthropic’s exploratory work on AI welfare, comes amid the emerging trend of users turning to AI chatbots like Claude or ChatGPT for low-cost therapy and professional advice. However, a recent study found that AI chatbots showed signs of stress and anxiety when users shared “traumatic narratives” about crime, war, or car accidents. This could potentially make the chatbots less useful in therapeutic settings with people.

Beyond AI welfare, Anthropic said Claude’s ability to end chats also has broader relevance to model alignment and safeguards.

Do AI chatbots have a sense of welfare or well-being?

Before rolling out Claude Opus 4, Anthropic said it studied the model’s self-reported and behavioural preferences. The AI model showed a “consistent aversion” to harmful prompts from users, such as requests to generate sexual content involving minors or to provide information related to terror acts.

Claude Opus 4 showed “a pattern of apparent distress when engaging with real-world users seeking harmful content” and a tendency to end such conversations with the user, as per the company.

“These behaviors primarily arose in cases where users persisted with harmful requests and/or abuse despite Claude repeatedly refusing to comply and attempting to productively redirect the interactions,” the company said.

However, Anthropic has added a disclaimer as well, noting, “We remain highly uncertain about the potential moral status of Claude and other LLMs (Large Language Models), now or in the future.”

This is because framing AI models in terms of their welfare or well-being risks anthropomorphising them. Several researchers argue that today’s LLMs do not possess genuine understanding or reasoning, describing them instead as stochastic systems (ones whose outputs are drawn from probability distributions) optimised for predicting the next token.

Anthropic has said it will keep exploring ways to mitigate risks to AI welfare, “in case such welfare is possible.”
