China’s DeepSeek Challenges US AI Costs With Low-Cost Training Model

The disclosure appeared in a peer-reviewed article published Wednesday in Nature, marking the first time the Hangzhou-based company revealed details of its training costs.

DeepSeek’s release of lower-cost AI systems earlier this year unsettled global tech markets, with investors fearing the models could erode the position of US giants such as Nvidia.

The Nature article, co-authored by founder Liang Wenfeng, said the R1 model was trained on 512 Nvidia H800 chips in a run that took 80 hours. An earlier version of the paper, released in January, omitted cost details.
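
For a sense of scale, the chip count and run time reported above can be multiplied into GPU-hours, the unit in which training compute is usually quoted. The sketch below is illustrative only: the function name `gpu_hours` is hypothetical, and it assumes all 512 chips ran for the full 80 hours, which the article does not state.

```python
def gpu_hours(num_chips: int, wall_clock_hours: float) -> float:
    """Total accelerator time, assuming every chip runs for the whole job."""
    return num_chips * wall_clock_hours


# Figures as reported in the article: 512 H800 chips, 80 hours.
print(gpu_hours(512, 80))  # 40960
```

That total covers only the reported 80-hour run. Turning it into a dollar figure would require a per-hour chip price, which the article does not give.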

Training large language models typically requires weeks of computation on powerful processors, often costing tens or even hundreds of millions of dollars. OpenAI chief executive Sam Altman said in 2023 that foundational model training had cost “much more” than $100 million, without providing specifics.

Washington has questioned DeepSeek’s claims. US officials told Reuters in June the company held “large volumes” of Nvidia’s high-end H100 chips despite American export bans. Nvidia said DeepSeek lawfully used H800 chips, while DeepSeek acknowledged for the first time that it also possessed A100 chips, employed in preliminary development stages.

DeepSeek’s access to advanced processors has helped it attract leading Chinese researchers, Reuters has previously reported.

The company also addressed allegations it had copied OpenAI’s models. US officials and industry figures suggested in January that DeepSeek “distilled” OpenAI’s technology into its own.

DeepSeek defended the practice, saying distillation improves performance and reduces costs, making AI more accessible. The method allows one AI to learn from another’s outputs, leveraging prior investment while cutting expenses.
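
To make "learning from another's outputs" concrete, here is a minimal sketch of the textbook soft-label form of distillation, in which a student model is penalized for diverging from a teacher model's output probabilities. It is not a description of DeepSeek's pipeline, which the article does not detail; the function names, the example logits, and the temperature value are all assumptions made for illustration. In practice, distilling a language model can also mean simply fine-tuning the student on text the teacher generated.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate tokens.
teacher = [2.0, 1.0, 0.1]
student = [0.5, 0.5, 0.5]
print(distillation_loss(teacher, student) > 0)  # True: the student disagrees with the teacher
```

Training the student to drive this loss toward zero pulls its predictions toward the teacher's, which is how the teacher's prior training investment gets reused.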

The firm acknowledged using Meta’s open-source Llama for some versions of its models. It also noted that training data for its V3 model included web content containing OpenAI-generated answers, but said this was incidental rather than deliberate.

OpenAI did not respond to Reuters’ request for comment.
