China’s top artificial intelligence company DeepSeek Ltd. has come unstuck in its efforts to develop its next-generation R2 reasoning model because it cannot get its hands on enough of Nvidia Corp.’s graphics processing units, according to a report.
The Information cited two anonymous sources familiar with DeepSeek’s efforts as saying that the company has been working on the upcoming R2 model for several months, but Chief Executive Liang Wenfeng is not yet satisfied with it. However, the company cannot improve the model’s capabilities with the limited number of GPUs at its disposal.
DeepSeek shot to fame earlier this year when it debuted its original reasoning model R1, which proved to be more than a match for the most advanced models developed by U.S. companies like OpenAI, Anthropic PBC and Meta Platforms Inc., despite being built at a fraction of the cost.
According to The Information, DeepSeek trained R1 on a cluster of 50,000 Hopper GPUs, which included around 10,000 H100s, 10,000 H800s, and around 30,000 of the lower-powered H20 GPUs that were purpose-built for the Chinese market.
Chinese companies have never been able to purchase the H100 legally, and the H800 has been barred from export to China since October 2023. It’s thought that some of these chips were secretly supplied to DeepSeek by its investor High-Flyer Capital Management, while others were procured via shell companies that access public cloud infrastructure services. The H20 GPUs were obtained legally, but they have since become hard to come by due to new U.S. government restrictions that prohibit their export to China.
Part of the problem is that many of the H20 GPUs in China are already being used by DeepSeek’s customers. The Information says the R1 model has been widely adopted by Chinese companies and government agencies, and most of them run it on H20 GPUs in the cloud. So there’s no more capacity available for DeepSeek to train its latest model.
It’s said that the H20 GPU shortages are already causing problems with R1, limiting how it is used by Chinese firms. If the R2 model significantly improves on R1, it’s expected that the demand for the model will increase beyond what Chinese cloud infrastructure providers can handle, according to staff interviewed by The Information.
The H20 processor is comparable to the H100 GPU that Nvidia sells to Western companies, but its bandwidth and connectivity were throttled to meet earlier restrictions on the types of chips that could be exported to China. However, President Donald Trump’s administration decided that even this scaled-down chip is too powerful to be shipped to a geopolitical rival, and in April it imposed new restrictions banning the H20’s export to China.
That decision has reportedly thrown a major spanner in the works of Chinese AI developers. While there are some domestic alternatives available, such as Huawei Technologies Co. Ltd.’s Ascend 910B chipset, these are even less powerful than the H20 and they lack support for Nvidia’s CUDA software stack – a programming architecture that’s used to optimize applications and AI models to run on Nvidia’s GPUs. That’s problematic because virtually all Chinese AI developers are thought to be using the CUDA software.
The Information says DeepSeek’s R1 and R2 models are also optimized for Nvidia’s chips, and its inability to access them could prove to be a major setback in its efforts to keep pace with its U.S. rivals.