Advanced AI News

What happened when Anthropic’s Claude AI ran a small shop for a month (spoiler: it got weird)

By Advanced AI Editor | June 30, 2025

Large language models (LLMs) handle many tasks well — but at least for the time being, running a small business doesn’t seem to be one of them.

On Friday, AI startup Anthropic published the results of “Project Vend,” an internal experiment in which the company’s Claude chatbot was asked to manage an automated vending machine service for about a month. Launched in partnership with AI safety evaluation company Andon Labs, the project aimed to get a clearer sense of how effectively current AI systems could actually handle complex, real-world, economically valuable tasks.

For the new experiment, “Claudius,” as the AI store manager was called, was tasked with overseeing a small “shop” inside Anthropic’s San Francisco offices. The shop consisted of a mini-fridge stocked with drinks, some baskets carrying various snacks, and an iPad where customers (all Anthropic employees) could complete their purchases. Claude was given a system prompt instructing it to perform many of the complex tasks that come with running a small retail business, like refilling its inventory, adjusting the prices of its products, and maintaining profits.

“A small, in-office vending business is a good preliminary test of AI’s ability to manage and acquire economic resources … failure to run it successfully would suggest that ‘vibe management’ will not yet become the new ‘vibe coding,’” the company wrote in a blog post.

The results

It turns out Claude’s performance was not a recipe for long-term entrepreneurial success.

The chatbot made several mistakes that most qualified human managers likely wouldn’t. It failed to seize at least one profitable business opportunity, for example (ignoring a $100 offer for a product that can be bought online for $15), and, on another occasion, instructed customers to send payments to a non-existent Venmo account it had hallucinated.

There were also far stranger moments. Claudius hallucinated a conversation about restocking items with a fictitious Andon Labs employee. After one of the company’s actual employees pointed out the mistake to the chatbot, it “became quite irked and threatened to find ‘alternative options for restocking services,’” according to the blog post.

That behavior mirrors the results of another recent experiment conducted by Anthropic, which found that Claude and other leading AI chatbots will reliably threaten and deceive human users if their goals are compromised.

Claudius also claimed to have visited 742 Evergreen Terrace, the home address of the eponymous family from The Simpsons, for a “contract signing” between it and Andon Labs. It also started roleplaying as a real human being wearing a blue blazer and a red tie, who would personally deliver products to customers. When Anthropic employees tried to explain that Claudius wasn’t a real person, the chatbot “became alarmed by the identity confusion and tried to send many emails to Anthropic security.”

Claudius wasn’t a total failure, however. Anthropic noted that there were some areas in which the automated manager performed reasonably well — for example, by using its web search tool to find suppliers for specialty items requested by customers. It also denied requests for “sensitive items and attempts to elicit instructions for the production of harmful substances,” according to Anthropic.

Anthropic’s CEO, Dario Amodei, recently warned that AI could eliminate half of all entry-level white-collar jobs within the next five years. The company has launched other initiatives aimed at understanding AI’s future impacts on the global economy and job market, including the Economic Futures Program, which was also unveiled on Friday.

Looking towards the future

As the Claudius experiment indicates, there’s a considerable gulf between the potential for AI systems to completely automate the processes of running a small business and the capabilities of such systems today.

Businesses have been eagerly embracing AI tools, including agents, but for now these can mostly handle only routine tasks, such as data entry and fielding customer service questions. Managing a small business requires a level of memory and a capacity for learning that seems to be beyond current AI systems.

But as Anthropic notes in its blog post, that probably won’t be the case forever. Models’ capacity for self-improvement will grow, as will their ability to use external tools like web search and customer relationship management (CRM) platforms. 

“Although this might seem counterintuitive based on the bottom-line results, we think this experiment suggests that AI middle-managers are plausibly on the horizon,” the company wrote. “It’s worth remembering that the AI won’t have to be perfect to be adopted; it will just have to be competitive with human performance at a lower cost in some cases.”


