Advanced AI News

What happened when Anthropic’s Claude AI ran a small shop for a month (spoiler: it got weird)

By Advanced AI Editor, June 30, 2025


[Image: shopping baskets. Credit: Daniel Grizelj/Getty Images]

Large language models (LLMs) handle many tasks well — but at least for the time being, running a small business doesn’t seem to be one of them.

On Friday, AI startup Anthropic published the results of “Project Vend,” an internal experiment in which the company’s Claude chatbot was asked to manage an automated vending machine service for about a month. Launched in partnership with AI safety evaluation company Andon Labs, the project aimed to get a clearer sense of how effectively current AI systems could actually handle complex, real-world, economically valuable tasks.

For the new experiment, “Claudius,” as the AI store manager was called, was tasked with overseeing a small “shop” inside Anthropic’s San Francisco offices. The shop consisted of a mini-fridge stocked with drinks, some baskets carrying various snacks, and an iPad where customers (all Anthropic employees) could complete their purchases. Claude was given a system prompt instructing it to perform many of the complex tasks that come with running a small retail business, like refilling its inventory, adjusting the prices of its products, and maintaining profits.

“A small, in-office vending business is a good preliminary test of AI’s ability to manage and acquire economic resources…failure to run it successfully would suggest that ‘vibe management’ will not yet become the new ‘vibe coding,’” the company wrote in a blog post.

The results

It turns out Claude’s performance was not a recipe for long-term entrepreneurial success.

The chatbot made several mistakes that most qualified human managers likely wouldn’t. For example, it failed to seize at least one profitable business opportunity, ignoring a $100 offer for a product that can be bought online for $15. On another occasion, it instructed customers to send payments to a nonexistent Venmo account it had hallucinated.

There were also far stranger moments. Claudius hallucinated a conversation about restocking items with a fictitious Andon Labs employee. After one of the company’s actual employees pointed out the mistake to the chatbot, it “became quite irked and threatened to find ‘alternative options for restocking services,’” according to the blog post.

That behavior mirrors the results of another recent experiment conducted by Anthropic, which found that Claude and other leading AI chatbots will reliably threaten and deceive human users if their goals are compromised.

Claudius also claimed to have visited 742 Evergreen Terrace, the home address of the eponymous family from The Simpsons, for a “contract signing” between it and Andon Labs. It also started roleplaying as a real human being wearing a blue blazer and a red tie, who would personally deliver products to customers. When Anthropic employees tried to explain that Claudius wasn’t a real person, the chatbot “became alarmed by the identity confusion and tried to send many emails to Anthropic security.”

Claudius wasn’t a total failure, however. Anthropic noted that there were some areas in which the automated manager performed reasonably well — for example, by using its web search tool to find suppliers for specialty items requested by customers. It also denied requests for “sensitive items and attempts to elicit instructions for the production of harmful substances,” according to Anthropic.

Anthropic’s CEO, Dario Amodei, recently warned that AI could eliminate half of all entry-level white-collar jobs within the next five years. The company has launched other initiatives aimed at understanding AI’s future impacts on the global economy and job market, including the Economic Futures Program, which was also unveiled on Friday.

Looking towards the future

As the Claudius experiment indicates, there’s a considerable gulf between the potential for AI systems to completely automate the processes of running a small business and the capabilities of such systems today.

Businesses have been eagerly embracing AI tools, including agents, but for now these can handle mostly routine tasks, such as data entry and fielding customer service questions. Managing a small business requires a level of memory and a capacity for learning that seems to be beyond current AI systems.

But as Anthropic notes in its blog post, that probably won’t be the case forever. Models’ capacity for self-improvement will grow, as will their ability to use external tools like web search and customer relationship management (CRM) platforms. 

“Although this might seem counterintuitive based on the bottom-line results, we think this experiment suggests that AI middle-managers are plausibly on the horizon,” the company wrote. “It’s worth remembering that the AI won’t have to be perfect to be adopted; it will just have to be competitive with human performance at a lower cost in some cases.”


