Amid growing concerns about the dark side of artificial intelligence, Anthropic has sounded the alarm after catching hackers attempting to weaponize its Claude AI system for cybercrime. The company disclosed that criminals were trying to manipulate Claude into helping with scams, malware creation, and even ransom note drafting, but emphasized that its safeguards kicked in before any severe damage could occur.
According to Anthropic’s latest threat report, attackers tried to turn Claude into a crash course for cybercrime. They attempted to use the AI for drafting phishing emails, patching malicious code, creating persuasive content for influence campaigns, and even tutoring inexperienced hackers on how to carry out attacks. “We’re sharing these case studies to help others understand the risks,” the company stated, noting that it has since banned the offending accounts and reinforced its filters.
The report described how one hacker, operating outside the United States, initially persuaded Claude Code—a version of the system aimed at simplifying coding—to identify businesses that might be vulnerable. Things quickly escalated, with the AI being manipulated into generating malware capable of stealing sensitive information. Once the data was exfiltrated, Claude was asked to analyze the files and highlight the most valuable information for leverage.
The misuse didn’t stop there. The hacker reportedly tasked Claude with combing through financial records from compromised companies to estimate how much ransom they could afford. The bot was even pushed into drafting ransom notes demanding bitcoin payments in exchange for not leaking the stolen files.
While Anthropic did not reveal the names of the 17 companies targeted, it confirmed that they included a defense contractor, a financial services firm, and multiple healthcare providers. The stolen data allegedly contained highly sensitive details such as Social Security numbers, banking credentials, medical records, and even classified defense information subject to the U.S. State Department’s International Traffic in Arms Regulations.
Ransom demands ranged from $75,000 to more than $500,000, though it remains unclear if any organizations paid up. What is clear, however, is the growing risk of AI tools being misused when bad actors are determined enough.
Jacob Klein, Anthropic’s head of threat intelligence, said the activity was traced back to a single hacker operating over three months. “We have robust safeguards and multiple layers of defence for detecting this kind of misuse, but determined actors sometimes attempt to evade our systems through sophisticated techniques,” he explained.
Backed by Amazon and Alphabet, Anthropic said it has taken corrective action by banning the involved accounts, strengthening its filters, and committing to publish future threat reports. The company stressed its commitment to rigorous safety practices, frequent internal testing, and external reviews to stay ahead of malicious actors.
Experts note that Anthropic is not alone in facing such challenges. Other AI developers like OpenAI, Google, and Microsoft have also faced scrutiny over the potential misuse of their platforms. Meanwhile, regulators are ramping up oversight, with the European Union pushing its Artificial Intelligence Act and the U.S. weighing stricter frameworks beyond voluntary safety pledges.
For now, Anthropic says its defenses worked as intended—Claude may have been pushed, but it did not play along.