The Wrong Way To Think About Implementing AI Agents

By Sagi Eliyahu

Recently, analysts at Gartner published a bold prediction regarding the future of agentic AI in the enterprise: more than 40% of in-progress agentic AI projects will be canceled by the end of 2027.

This would seem to support recent findings from related studies on AI agents in the enterprise. Earlier this year, for example, researchers at Carnegie Mellon University conducted an interesting-yet-flawed experiment: They staffed a fake software company, TheAgentCompany, entirely with AI agents. They asked the agents, each powered by a specific LLM, to take on the day-to-day work of a modern software company. They assigned the agents work, and that was about it as far as instruction or orchestration went. After that, they asked them to get to work.

AI agents have been the subject of frenzied excitement in the enterprise, with such prominent CEOs as Marc Benioff, Jensen Huang, Satya Nadella and Mark Zuckerberg all predicting their impending, transformative preeminence.

CMU’s experiment, therefore, garnered lots of interest. But as outlets like Business Insider have reported, the results were not good. The best-performing agent finished just 24% of the jobs assigned to it. Most completed just 10%. It cost each agent on average $6 to complete an individual task, which added up quickly, since the jobs the agents had been assigned required completing many different tasks. Simple tasks stalled due to agents’ inability to overcome unexpected challenges, like dismissing a pop-up ad.

Observers were quick to interpret these results — along with results of still more studies conducted over the past year or so — as evidence that AI agents are perhaps not quite as capable as tech CEOs have made them out to be.

“[AI agents] are clearly not ready for more complex gigs humans excel at,” Futurism’s Joe Wilkins wrote.

Here’s how Business Insider’s Shubham Agarwal put it: “The findings, along with other emerging research about AI agents, complicate the idea that an AI agent workforce is just around the corner — there’s a lot of work they simply aren’t good at.”

Agarwal concluded the experiment was a “total disaster.”

This, however, is the incorrect conclusion to draw — incomplete at best and irrelevant at its core.

Augment, don’t replace

That’s because it stems from a flawed premise: that AI agents should be expected to replace humans outright. They shouldn’t be. They’re meant to augment them.

The agents in CMU’s experiment, in other words, were set up to fail. The culprit in the experiment was not the capacity of the agents themselves, but a misapplication of their purpose.

This, interestingly, is what underpins Gartner’s recent research into AI agents in the enterprise. According to Anushree Verma, a senior director analyst at Gartner, many in-progress AI agent deployments will fail ultimately because, “They are mostly driven by hype and are often misapplied.”

What CMU’s experiment ultimately showcases is precisely this: what happens when agentic AI rollouts stem foremost from such misapplication. It proves not that agents can’t complete complex work, but rather that CMU simply attempted to implement AI agents in entirely the wrong way.

So what’s a better way? To start, we shouldn’t treat this technology as magic.

AI agents, simply put, are tools. They’re not human replacements. They’re things for humans to use.

And just like any tool, the value humans derive from agents comes down not just to how smart or powerful individual agents are, but how strategically we leverage them to improve our own capacity.

Setting a bunch of specialized AI agents loose inside an organization without structures governing how they should work with each other or with human workers — not to mention without connecting them to the various departments, systems and policy centers, such that they can be orchestrated across them — simply isn’t very strategic.

In fact, it’s not a strategic way to leverage any tool, resource or intelligent entity, humans included. Try the same experiment, but replace the AI agents with highly intelligent human workers. Let those workers loose inside your organization without roles, responsibilities, organization or protocol, and you’ll get the same result: noisy, inefficient, expensive chaos.

LLMs can’t deliver consistently good work or work effectively together toward a common set of goals without other supporting technology or infrastructure.

So what might be more useful instead? If the goal is to determine what, ultimately, AI agents are capable of in an enterprise context, we should experiment with them using conditions consistent with an enterprise context. And we should ensure there’s adequate structure in place behind the scenes — such as end-to-end orchestration infrastructure — enabling AI agents to deliver genuine enterprise value.

Structure and strategy matter

People who believe AI agents are exciting because they’ll replace humans have it all wrong. AI agents are exciting not because they’ll replace humans, but because they’ll replace traditional enterprise software.

It’s in this way that AI agents could transform the enterprise: by improving not just the capacity of human-led organizations, but the experiences provided to human workers inside them.

But only if we will it. For organizations of every sort — from TheAgentCompany to Alphabet to those surveyed by Gartner — getting transformational value out of agents will come down to one thing: how strategically we integrate them into the infrastructure of our day-to-day operations, and what sort of structures we put in place to govern them.

This is as true of AI agents as it is of any other sort of intelligent entity we leverage inside the enterprise, including humans. Intelligent entities need structure to work effectively. You want intelligent entities to be able to work autonomously and creatively in pursuit of the goals you set for them. But to effectively pursue those goals, you also need direction and hierarchy, governance and org charts, processes and rules.

What we should be iterating and experimenting on is the sort of structure we put around AI agents in order to maximize their impact for humans.

This is a matter, in the end, not only of strategy and performance, but of security; thinking carefully about how we construct and deploy AI agents in the enterprise, for example, is how we will wall off AI from things it shouldn’t be touching internally, such as login credentials, sensitive data or certain actions.

It’s also, however, the only way we’ll ever truly find out just what this technology is capable of. Anything else is a waste of time.

Sagi Eliyahu is the co-founder and CEO of Tonkean, an AI-powered intake and orchestration platform that helps enterprise shared-service teams such as procurement, legal, IT and HR create processes that people actually follow. Tonkean’s agents use AI to anticipate employees’ needs and guide them through their requests.
