Most of the attention in AI today is focused on output: what a model generates, how accurate or convincing it is, how well it performs against benchmarks. But for Hagerty, the real ethical tension begins earlier, at the foundation model level. This is the raw infrastructure of modern AI, the base layer trained on vast datasets scraped from the web, and it is what fuels large language models (LLMs) like ChatGPT and Claude.
“The foundation is where it happens,” Hagerty told me. “That is the first thing the system learns, and if it is full of junk, that junk does not go away.”
These base models are designed to be general-purpose. That is what makes them both powerful and dangerous, Hagerty said. Because they are not built with specific tasks or constraints in mind, they tend to absorb everything, from valuable semantic structures to toxic internet sludge. And once trained, the models are hard to audit. Even their creators often cannot say for sure what a model knows or how it will respond to a given prompt.
Hagerty compared this to pouring a flawed concrete base for a skyscraper. If the mix is wrong from the start, you might not see cracks immediately. But over time, the structure becomes unstable. In AI, the equivalent is brittle behavior, unintended bias or catastrophic misuse once a system is deployed. Without careful shaping early on, a model carries the risks it absorbed during training into every downstream application.
He is not alone in this concern. Researchers from Stanford’s Center for Research on Foundation Models (CRFM) have repeatedly warned about the emergent risks of large-scale training, including bias propagation, knowledge hallucination, data contamination and the difficulty of pinpointing failures. These problems can be mitigated but not eliminated, which makes early design choices, such as data curation, filtering and governance, all the more critical.
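To see what an early design choice like data filtering can look like in practice, here is a minimal sketch in Python. The blocklist, length cutoff and quality heuristic are hypothetical stand-ins, not any lab’s actual pipeline; the point is that decisions about what a model is allowed to absorb get made in code like this, long before anyone sees an output.

```python
# Minimal sketch of a pre-training data filter. The blocklist, length cutoff and
# alphabetic-ratio heuristic are illustrative assumptions, not a real lab's pipeline.

BLOCKLIST = {"spam-domain.example", "content-farm.example"}  # hypothetical bad sources

def looks_low_quality(text: str) -> bool:
    """Crude quality heuristic: reject very short or mostly non-alphabetic documents."""
    if len(text) < 200:
        return True
    alpha_ratio = sum(c.isalpha() for c in text) / len(text)
    return alpha_ratio < 0.6

def filter_corpus(documents):
    """Yield only documents that pass source (governance) and quality (curation) checks."""
    for doc in documents:
        if doc["source_domain"] in BLOCKLIST:
            continue  # governance: exclude known bad sources outright
        if looks_low_quality(doc["text"]):
            continue  # curation: drop junk before the model can absorb it
        yield doc
```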
As Hagerty sees it, one of the biggest ethical barriers to meaningful progress is the sheer vagueness of what companies mean when they say “AI.” Ask five product teams what they mean by “AI-powered,” and you will likely get five different answers. To Hagerty, that definitional slipperiness is one of the core ethical failures of the current era.
“Most of the time, when people say AI, they mean automation. Or a decision tree. Or an if/else statement,” he said.
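Hagerty’s point is easy to illustrate. The sketch below is a hypothetical “recommendation engine” of the kind that routinely ships under an “AI-powered” label; the product and thresholds are invented, and nothing in it learns or reasons. It is just a hand-written rule.

```python
# A deterministic rule set of the sort often marketed as "AI". The plan names and
# thresholds are hypothetical; there is no model, no training and no inference here.

def recommend_plan(monthly_usage_gb: float, support_tickets: int) -> str:
    """Rule-based 'recommendation engine' a vendor might describe as AI-powered."""
    if monthly_usage_gb > 500:
        return "enterprise"
    elif monthly_usage_gb > 100 or support_tickets > 3:
        return "pro"
    else:
        return "basic"

print(recommend_plan(monthly_usage_gb=120, support_tickets=1))  # -> "pro"
```

Deterministic logic like this can be perfectly useful; the problem Hagerty describes starts when it is sold as something it is not.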
The lack of clarity around terms is not an academic quibble. When companies present deterministic software as intelligent reasoning, users extend it a trust it has not earned. When startups pitch basic search-and-filter tools as generative models, investors throw money at mirages. Hagerty refers to this as “hype leakage” and sees it as a growing source of confusion and reputational damage.
In regulated industries like finance or healthcare, the consequences can be more severe. A user misled into thinking a system understands more than it actually does may delegate decisions that should remain in human hands. The line between tool and agent blurs, and with it, accountability.
This problem also leads to wasted effort. Hagerty cited recent research on the misuse of LLMs for time-series forecasting, the task of predicting future values from historical data, where classical statistical methods remain more accurate and efficient. Yet some companies continue to use LLMs anyway, chasing novelty or signaling innovation.
“You are burning GPUs to get bad answers,” he said. “And worse, you are calling it progress.”
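The forecasting case makes the mismatch concrete. Below is a minimal sketch of one common classical baseline of the kind Hagerty is alluding to: a seasonal-naive forecast in plain NumPy. The sales numbers and the seven-day cycle are invented for illustration, but a few lines like these run in milliseconds on a CPU, with no GPU and no LLM, and often make a surprisingly strong baseline.

```python
# Seasonal-naive forecasting: predict each future step with the value from one
# season earlier. The "weekly sales" series below is synthetic, for illustration only.
import numpy as np

def seasonal_naive_forecast(history: np.ndarray, season: int, horizon: int) -> np.ndarray:
    """Repeat the most recent full season forward to cover the forecast horizon."""
    last_season = history[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_season, reps)[:horizon]

# Hypothetical daily sales with a 7-day cycle plus noise.
rng = np.random.default_rng(0)
sales = np.tile([120, 90, 95, 100, 110, 160, 180], 8) + rng.normal(0, 5, 56)
print(seasonal_naive_forecast(sales, season=7, horizon=14).round(1))
```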
The ethical issue is not just inefficiency. It is misrepresentation. Teams build products around technology they barely understand, wrap it in marketing that overstates its capabilities and deploy it to users who have no way to evaluate what they are using.