By Cary Springfield, International Banker
Just as Donald Trump was settling into his second term as US president and talking up a mammoth $500-billion investment in the Stargate Project (an AI-focused joint venture of OpenAI, SoftBank, Oracle and MGX), a little-known start-up from China called DeepSeek not only overshadowed the United States president’s plans but has since comprehensively upended the global AI landscape. By introducing the world to DeepSeek-R1—a powerful logical-reasoning large language model (LLM) developed with substantially less money and computing power than rival LLMs such as OpenAI’s ChatGPT and Google’s Gemini required—DeepSeek has sent massive shockwaves across the world.
The start-up was launched in July 2023 by Chinese entrepreneur Liang Wenfeng and backed by his quantitative hedge fund, High-Flyer, with the goal of building a powerful LLM that was comparable in capability to ChatGPT, acquiring thousands of chips from Nvidia in its quest to do so. In late December 2024, the company introduced its foundational DeepSeek-V3 model, which performed reasonably well in tests involving mathematics, computer coding and understanding texts when compared to advanced models such as OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet and Meta’s Llama 3.
It was after the introduction of the more advanced R1 model on January 20, however, that the world sat up and took notice of DeepSeek. A research paper revealed that the company built its technology using only a fraction of the computer chips that US companies employed to train their own systems. While DeepSeek-V3 required a reported 2,000 Nvidia H800 graphics-processing units (GPUs) to be trained, for instance, Meta’s Llama 3 used a whopping 16,000 H100 GPUs. Reports subsequently emerged that highlighted other widely lauded methodologies employed by DeepSeek, including its use of efficient algorithms, optimised GPU allocation and high computational efficiency, particularly through an AI-training technique called Mixture of Experts (MoE).
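For readers interested in the mechanics behind that efficiency, the sketch below illustrates the general Mixture-of-Experts idea in simplified form: a router sends each token to only a small subset of "expert" networks, so most of the model's parameters sit idle for any given token. The layer sizes, the top-two routing rule and the use of PyTorch are illustrative assumptions, not DeepSeek's actual configuration.

```python
# A minimal sketch of a Mixture-of-Experts (MoE) layer, the general technique the
# article refers to; sizes and top-k routing are illustrative, not DeepSeek's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        # Each "expert" is a small feed-forward network; only top_k of them
        # run per token, which is where the compute savings come from.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )
        self.router = nn.Linear(dim, num_experts)  # scores each expert per token
        self.top_k = top_k

    def forward(self, x):                           # x: (tokens, dim)
        scores = self.router(x)                     # (tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = indices[:, slot] == e        # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

tokens = torch.randn(16, 64)
print(MoELayer()(tokens).shape)  # torch.Size([16, 64])
```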
The company’s achievements are even more impressive given the punitive measures implemented by the US government restricting China from accessing the most advanced Nvidia AI chips. Nonetheless, DeepSeek adapted to this roadblock. “Forced to work with less powerful but more available H800 GPUs, the company optimized its model to run on lower-end hardware without sacrificing performance,” Arun Rai, director of the Center for Digital Innovation at Georgia State University’s Robinson College of Business (RCB), explained. “DeepSeek didn’t just launch an AI model—it reshaped the AI conversation, showing that optimisation, smarter software, and open access can be just as transformative as massive computing power.”
Then there is DeepSeek’s ability to resolve a major long-running challenge facing AI models—that is, to be able to reason sequentially. “Traditionally, LLMs have been trained on a very compute-intensive process called supervised learning, where models are fed immense quantities of labelled data and then match inputs to correctly labelled outputs,” Ian Mortimer and Matthew Page, portfolio managers for Guinness Global Investors, wrote in a February 24 report for the UK-based fund-management firm. “In contrast, DeepSeek’s reasoning model was accomplished using a technique called reinforcement learning, where responses are fine-tuned by rewarding accurate outputs and penalising mistakes.”
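The contrast the Guinness managers describe can be seen in a toy example. In the sketch below, a model's sampled answers are scored rather than matched against labelled outputs, and a simple policy-gradient update rewards accurate responses and penalises mistakes; the four-way "answer" distribution and the reward rule are invented purely for illustration and bear no relation to DeepSeek's actual training pipeline.

```python
# Toy illustration of reinforcement learning for fine-tuning: sampled answers are
# rewarded or penalised, and the update shifts probability toward rewarded answers.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.zeros(4, requires_grad=True)          # a 4-way "answer" distribution
optimizer = torch.optim.SGD([logits], lr=0.5)
correct_answer = 2                                   # pretend answer index 2 is verifiably correct

for step in range(200):
    probs = F.softmax(logits, dim=-1)
    answer = torch.multinomial(probs, 1).item()      # sample an answer
    reward = 1.0 if answer == correct_answer else -1.0  # reward accuracy, penalise mistakes
    loss = -reward * torch.log(probs[answer])        # REINFORCE-style policy-gradient loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(F.softmax(logits, dim=-1))  # probability mass has shifted toward the correct answer
```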
However, arguably the most profound implications that DeepSeek has for the AI landscape concern costs, with the company confirming that only $5.5 million was spent to train its new system—substantially lower than the $100 million-plus training budget that OpenAI reportedly needed for GPT-4o. By demonstrating that the financial barriers preventing other companies from developing powerful AI models of their own were not as prohibitive as previously believed, R1 transformed AI development in an instant.
And investors knew it. The erasure of almost $1 trillion in US stock-market capitalisation in late January underscored the threat that low-cost Chinese AI models could pose to the dominance commanded by AI giants such as Nvidia. Indeed, the US chip mammoth saw nearly $600 billion of market capitalisation wiped out shortly after R1’s launch, marking the largest single-day loss for a company in stock-market history.
“I think the market responded to R1, as in, ‘Oh my gosh. AI is finished. You know, it dropped out of the sky. We don’t need to do any computing anymore.’ It’s exactly the opposite. It’s [the] complete opposite,” Nvidia’s founder and chief executive officer, Jensen Huang, said when speaking with Alex Bouzari, chief executive of DataDirect Networks (DDN), in an interview held on February 21. Huang added that R1’s release would be inherently positive for the AI market, spurring AI adoption. “It’s making everybody take notice that, okay, there are opportunities to have the models be far more efficient than what we thought was possible. And so, it’s expanding, and it’s accelerating the adoption of AI.”
Some view DeepSeek as not just another AI competitor but a monumental paradigm shift in the technology’s development. “The team behind it is composed of engineers who aren’t bound by traditional AI methodologies,” Par Botes, the vice president of AI infrastructure at Pure Storage, explained in a blog piece for the US tech firm. “They are mathematicians at heart, pragmatists in execution, and unconcerned with how AI has been done in the past. They saw a new way forward, one that redefines how we think about AI models, data management, and ultimately, the infrastructure that powers it all.”
In that regard, the rise of smaller, more specialised AI models can be envisioned, rather than only an environment of all-encompassing models trained liberally on every type of data. “The distillation of large language models (LLMs) into small language models (SLMs) could lead to thousands or tens of thousands of SLMs equipped with reasoning functionality,” a February 2025 deep-dive report into DeepSeek from Research and Markets concluded. “This will create hardware solutions that are more cost-effective, use less power, and are programmed to suit different design targets. Cloud companies will find new growth from hosting large numbers of SLMs. Smartphone makers will also benefit from SLMs.”
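Distillation of this kind can be sketched in a few lines: a compact "student" model is trained to reproduce the output distribution of a much larger "teacher", which is how capable behaviour can be transferred into smaller, cheaper models. The toy networks, random data and temperature value below are placeholder assumptions rather than any production LLM or SLM.

```python
# A minimal sketch of knowledge distillation, the LLM-to-SLM idea described in the
# report: a small "student" learns to match a larger "teacher"'s output distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab, dim = 100, 32
teacher = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, vocab))  # "large" model
student = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, vocab))    # much smaller model
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0   # softens distributions so the student also learns relative rankings

for step in range(500):
    x = torch.randn(64, dim)                          # stand-in for token representations
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(x) / temperature, dim=-1)
    student_log_probs = F.log_softmax(student(x) / temperature, dim=-1)
    # KL divergence between teacher and student distributions is the distillation loss.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final distillation loss: {loss.item():.4f}")
```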
In China, meanwhile, DeepSeek’s impact has been remarkable, inspiring more Chinese firms to bring their own open-source LLMs to market. Chinese start-up Butterfly Effect, backed by tech giant Tencent Holdings, for example, released its Manus AI agent in March, which it claims performs more capably than OpenAI models.
As more such low-cost and highly efficient models are introduced, moreover, more real-world applications can benefit. In China, firms are already adopting these models in industries such as telecommunications, data centres and financial services. Taking the financial world as just one example, the technology is prompting an intensifying race among Chinese hedge funds to integrate such models into their investment workflows, with fund managers such as Baiont Quant, WizardQuant and Mingshi Investment Management reportedly among dozens stepping up their AI-research capabilities.
DeepSeek is demonstrating that a company without access to the world’s most sophisticated semiconductor technology can still develop competitive, game-changing AI-powered solutions. “We are in the eye of the storm” of an AI revolution, according to Feng Ji, chief executive of Baiont Quant, which uses machine learning to trade financial markets. “Two years ago, many fund managers would look at us AI-powered quants with mockery or disbelief,” Feng Ji told Reuters in mid-March. “Today, these sceptics could be out of business if they don’t embrace AI.”
Despite DeepSeek’s success in ushering in a potentially seismic shift towards cost-optimised AI development, however, some expect the premium, resource-intensive research that dominates AI in the West to retain a sizeable presence in higher-end markets. Such research will continue to require copious amounts of computing resources, but it will remain important for generating breakthroughs in artificial general intelligence and complex multi-modal tasks.
According to Morgan Stanley, this means that a “dual-track” AI market will most likely emerge, pitting China’s efficiency-driven approach against the capital-intensive, high-performance AI models of the US and other advanced economies. “While China refines its ability to build powerful AI with limited resources, firms that are not limited by US chip restrictions will continue to push the boundaries of AI research through massive cloud investments,” the US bank’s investment-management unit stated in a March 2025 report.
“Rather than an ‘either-or’ scenario, we believe both models can thrive. Emerging market investors now have access to a diversified AI opportunity set—one that balances the high-cost, high-performance AI segment with China’s more accessible, cost-efficient solution,” Morgan Stanley added. “As Beijing intensifies its tech ambitions, DeepSeek represents more than just an AI milestone—it’s a symbol of China’s broader resurgence in innovation, investment and global competitiveness.”