DeepSeek jolts AI industry: Why AI’s next leap may not come from more data, but more compute at inference

By Advanced AI Bot | April 6, 2025 | 7 min read

The AI landscape continues to evolve at a rapid pace, with recent developments challenging established paradigms. Early in 2025, Chinese AI lab DeepSeek unveiled a new model that sent shockwaves through the AI industry and resulted in a 17% drop in Nvidia’s stock, along with other stocks related to AI data center demand. This market reaction was widely reported to stem from DeepSeek’s apparent ability to deliver high-performance models at a fraction of the cost of rivals in the U.S., sparking discussion about the implications for AI data centers. 

To contextualize DeepSeek’s disruption, we think it’s useful to consider a broader shift in the AI landscape driven by the scarcity of additional training data. Because the major AI labs have already trained their models on much of the available public data on the internet, data scarcity is slowing further improvements in pre-training. As a result, model providers are turning to “test-time compute” (TTC), where reasoning models (such as OpenAI’s “o” series of models) “think” before responding to a question at inference time, as an alternative way to improve overall model performance. The current thinking is that TTC may exhibit scaling-law improvements similar to those that once propelled pre-training, potentially enabling the next wave of transformative AI advancements.
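To make the idea concrete, the toy sketch below illustrates one simple form of TTC, self-consistency sampling: the same question is sampled several times and the majority answer wins. The simulated model, its 60% per-sample accuracy and all other numbers are our own assumptions for illustration, not a description of DeepSeek’s or OpenAI’s actual methods.

```python
# Toy illustration of test-time compute via self-consistency sampling.
# The model call is simulated; in practice it would be a temperature > 0
# call to any LLM API. All numbers are assumptions for illustration.
import random
from collections import Counter

def sample_model(prompt: str) -> str:
    # Stand-in for one sampled completion: the "model" answers correctly
    # with probability 0.6 on a hard question, otherwise guesses nearby.
    return "42" if random.random() < 0.6 else random.choice(["41", "43", "44"])

def answer_with_ttc(prompt: str, n_samples: int) -> str:
    # Spending more samples (more inference-time compute) raises the chance
    # that the majority answer is correct, with no additional training.
    votes = Counter(sample_model(prompt) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    for n in (1, 5, 25):
        correct = sum(answer_with_ttc("What is 6 * 7?", n) == "42" for _ in range(1000))
        print(f"{n:>2} samples per question -> {correct / 1000:.0%} correct")
```

Accuracy improves as more samples are drawn per question, which is the basic scaling behavior that makes TTC attractive once pre-training gains flatten out.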

These developments indicate two significant shifts: First, labs operating on smaller (reported) budgets are now capable of releasing state-of-the-art models. The second shift is the focus on TTC as the next potential driver of AI progress. Below we unpack both of these trends and the potential implications for the competitive landscape and broader AI market.

Implications for the AI industry

We believe that the shift towards TTC and the increased competition among reasoning models may have a number of implications for the wider AI landscape across hardware, cloud platforms, foundation models and enterprise software. 

1. Hardware (GPUs, dedicated chips and compute infrastructure)

From massive training clusters to on-demand “test-time” spikes: In our view, the shift towards TTC may have implications for the type of hardware resources that AI companies require and how they are managed. Rather than investing in increasingly larger GPU clusters dedicated to training workloads, AI companies may instead increase their investment in inference capabilities to support growing TTC needs. While AI companies will likely still require large numbers of GPUs to handle inference workloads, the differences between training workloads and inference workloads may affect how those chips are configured and used. Specifically, since inference workloads tend to be more dynamic (and “spiky”), capacity planning may become more complex than it is for batch-oriented training workloads.
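The toy calculation below, using assumed numbers, illustrates why: meeting a latency target means provisioning for peak rather than average demand, so the same average load requires far more headroom when traffic is spiky.

```python
# Toy capacity-planning arithmetic with assumed numbers: steady, batch-like
# load vs. spiky inference traffic that must be served at its peak.
import math

def gpus_needed(avg_requests_per_s: float, peak_to_avg: float,
                requests_per_gpu_s: float) -> int:
    # Size the fleet for peak demand so latency targets hold during bursts.
    peak = avg_requests_per_s * peak_to_avg
    return math.ceil(peak / requests_per_gpu_s)

steady = gpus_needed(avg_requests_per_s=200, peak_to_avg=1.0, requests_per_gpu_s=5)
spiky  = gpus_needed(avg_requests_per_s=200, peak_to_avg=4.0, requests_per_gpu_s=5)
print(steady, spiky)  # 40 GPUs vs. 160 GPUs for the same average load
```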

Rise of inference-optimized hardware: We believe that the shift in focus towards TTC is likely to increase opportunities for alternative AI hardware that specializes in low-latency inference-time compute. For example, we may see more demand for GPU alternatives such as application specific integrated circuits (ASICs) for inference. As access to TTC becomes more important than training capacity, the dominance of general-purpose GPUs, which are used for both training and inference, may decline. This shift could benefit specialized inference chip providers. 

2. Cloud platforms: Hyperscalers (AWS, Azure, GCP) and cloud compute

Quality of service (QoS) becomes a key differentiator: One issue preventing AI adoption in the enterprise, in addition to concerns around model accuracy, is the unreliability of inference APIs. Problems associated with unreliable API inference include fluctuating response times, rate limiting and difficulty handling concurrent requests and adapting to API endpoint changes. Increased TTC may further exacerbate these problems. In these circumstances, a cloud provider able to provide models with QoS assurances that address these challenges would, in our view, have a significant advantage.
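As a rough illustration of what unreliable inference APIs force onto application developers today, the generic sketch below wraps a simulated model call with retries, exponential backoff and jitter on rate-limit errors; the failure rate and function names are placeholders, not any particular vendor’s API. A provider offering genuine QoS guarantees would make much of this client-side machinery unnecessary.

```python
# Generic client-side workaround for an unreliable inference API:
# retry with exponential backoff and jitter on rate-limit errors.
# The endpoint and its 30% failure rate are simulated placeholders.
import random
import time

class RateLimitError(Exception):
    pass

def call_inference_api(prompt: str) -> str:
    # Placeholder for an HTTP call to a hosted model endpoint.
    if random.random() < 0.3:
        raise RateLimitError("429 Too Many Requests")
    return f"response to: {prompt}"

def robust_call(prompt: str, max_retries: int = 5, base_delay: float = 0.5) -> str:
    for attempt in range(max_retries):
        try:
            return call_inference_api(prompt)
        except RateLimitError:
            # Back off exponentially, with jitter to avoid synchronized retries.
            time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))
    raise RuntimeError("inference API unavailable after retries")

print(robust_call("Summarize the quarterly report."))
```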

Increased cloud spend despite efficiency gains: Rather than reducing demand for AI hardware, it is possible that more efficient approaches to large language model (LLM) training and inference may follow the Jevons Paradox, a historical observation where improved efficiency drives higher overall consumption. In this case, efficient inference models may encourage more AI developers to leverage reasoning models, which, in turn, increases demand for compute. We believe that recent model advances may lead to increased demand for cloud AI compute for both model inference and smaller, specialized model training.
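A back-of-the-envelope example with assumed numbers shows how this can play out: a 5x efficiency gain cuts compute per query, but if cheaper queries grow usage 10x, total compute demand still doubles.

```python
# Toy Jevons-paradox arithmetic; all figures are assumptions for illustration.
compute_per_query_old, queries_old = 1.0, 1_000_000     # baseline
compute_per_query_new, queries_new = 0.2, 10_000_000    # 5x more efficient, 10x usage

print(f"before: {compute_per_query_old * queries_old:,.0f} compute units")
print(f"after:  {compute_per_query_new * queries_new:,.0f} compute units")
```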

3. Foundation model providers (OpenAI, Anthropic, Cohere, DeepSeek, Mistral)

Impact on pre-trained models: If new players like DeepSeek can compete with frontier AI labs at a fraction of the reported costs, proprietary pre-trained models may become less defensible as a moat. We can also expect further innovations in TTC for transformer models and, as DeepSeek has demonstrated, those innovations can come from sources outside of the more established AI labs.   

4. Enterprise AI adoption and SaaS (application layer)

Security and privacy concerns: Given DeepSeek’s origins in China, there is likely to be ongoing scrutiny of the firm’s products from a security and privacy perspective. In particular, the firm’s China-based API and chatbot offerings are unlikely to be widely used by enterprise AI customers in the U.S., Canada or other Western countries. Many companies are reportedly moving to block the use of DeepSeek’s website and applications. We expect that DeepSeek’s models will face scrutiny even when they are hosted by third parties in U.S. and other Western data centers, which may limit enterprise adoption of the models. Researchers are already pointing to examples of security concerns around jailbreaking, bias and harmful content generation. Given consumer attention, we may see experimentation and evaluation of DeepSeek’s models in the enterprise, but it is unlikely that enterprise buyers will move away from incumbents due to these concerns.

Vertical specialization gains traction: In the past, vertical applications that use foundation models mainly focused on creating workflows designed for specific business needs. Techniques such as retrieval-augmented generation (RAG), model routing, function calling and guardrails have played an important role in adapting generalized models for these specialized use cases. While these strategies have led to notable successes, there has been persistent concern that significant improvements to the underlying models could render these applications obsolete. As Sam Altman cautioned, a major breakthrough in model capabilities could “steamroll” application-layer innovations that are built as wrappers around foundation models.
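For readers unfamiliar with one of these techniques, the hypothetical sketch below shows model routing in its simplest form: inexpensive queries go to a small, fast model, while queries that look like they need step-by-step reasoning go to a larger reasoning model. The model names and the difficulty heuristic are placeholders, not a production design.

```python
# Hypothetical model-routing sketch: cheap queries -> small model,
# hard-looking queries -> expensive reasoning model. The heuristic and
# model names are illustrative placeholders only.
def looks_hard(query: str) -> bool:
    # Naive stand-in for a learned router: long or proof/debug-style queries
    # are treated as needing step-by-step reasoning.
    keywords = ("prove", "debug", "derive")
    return len(query.split()) > 40 or any(k in query.lower() for k in keywords)

def route(query: str) -> str:
    model = "reasoning-model-large" if looks_hard(query) else "chat-model-small"
    return f"[{model}] would handle: {query}"

print(route("What are your support hours?"))
print(route("Prove that the proposed caching scheme never serves stale data."))
```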

However, if advancements in train-time compute are indeed plateauing, the threat of rapid displacement diminishes. In a world where gains in model performance come from TTC optimizations, new opportunities may open up for application-layer players. Innovations in domain-specific post-training algorithms — such as structured prompt optimization, latency-aware reasoning strategies and efficient sampling techniques — may provide significant performance improvements within targeted verticals.
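As a sketch of what a latency-aware reasoning strategy might look like in practice, the example below keeps sampling additional reasoning attempts only while a per-request latency budget allows, then returns the best-scoring answer found so far; the generation and scoring functions are simulated placeholders rather than any lab’s published method.

```python
# Illustrative latency-aware reasoning loop: sample extra attempts only while
# the latency budget allows, then return the best-scoring answer so far.
# Generation latency and self-scores are simulated placeholders.
import random
import time

def generate_candidate(prompt: str) -> tuple[str, float]:
    time.sleep(random.uniform(0.05, 0.15))                     # simulated model latency
    return f"answer-{random.randint(0, 9)}", random.random()   # (answer, self-score)

def answer_within_budget(prompt: str, budget_s: float = 0.4) -> str:
    deadline = time.monotonic() + budget_s
    best_answer, best_score = generate_candidate(prompt)       # always return something
    while time.monotonic() < deadline:
        answer, score = generate_candidate(prompt)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer

print(answer_within_budget("Classify this support ticket."))
```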

Any performance improvement would be especially relevant in the context of reasoning-focused models like OpenAI’s o1 and DeepSeek-R1, which often exhibit multi-second response times. In real-time applications, reducing latency and improving the quality of inference within a given domain could provide a competitive advantage. As a result, application-layer companies with domain expertise may play a pivotal role in optimizing inference efficiency and fine-tuning outputs.

DeepSeek’s release reflects a declining emphasis on ever-increasing amounts of pre-training as the sole driver of model quality. Instead, the development underscores the growing importance of TTC. While the direct adoption of DeepSeek models in enterprise software applications remains uncertain due to ongoing scrutiny, their impact on driving improvements in other existing models is becoming clearer.

We believe that DeepSeek’s advancements have prompted established AI labs to incorporate similar techniques into their engineering and research processes, supplementing their existing hardware advantages. The resulting reduction in model costs, as predicted, appears to be contributing to increased model usage, aligning with the principles of Jevons Paradox.

Pashootan Vaezipoor is technical lead at Georgian.
