The Chinese AI start-up DeepSeek has released V3.2-Exp, an experimental version of its language model, and at the same time reduced the prices for its API services by more than 50 percent. As the company announced on its Hugging Face page, the new version marks an intermediate step towards the next generation of AI architecture.
The company, which was only founded in 2023 and caused a stir in Silicon Valley earlier this year with its R1 model, says it is working with Chinese chip manufacturers on the further development of its models. The new V3.2-Exp version builds on the older V3.1 model and introduces a new technology called DeepSeek Sparse Attention (DSA).
Sparse Attention technology is designed to improve efficiency when processing long text sequences. While the conventional attention mechanism in large language models considers every token in the context at once, DSA concentrates only on the most relevant parts of the input. According to DeepSeek, this considerably reduces the computational effort without significantly affecting the quality of the output.
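DeepSeek has not published DSA's exact selection mechanism in this announcement, but the general idea behind sparse attention can be sketched with a toy top-k variant: each query only attends to its k highest-scoring keys rather than the full sequence. All shapes and the `top_k` parameter below are illustrative assumptions, not details of DSA itself.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_attention(q, k, v, top_k=4):
    """Toy sparse attention: each query row attends only to its
    top_k highest-scoring keys instead of all n keys."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n_q, n_k)
    # keep only the top_k scores per query, mask out the rest
    kth = np.sort(scores, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    return softmax(masked) @ v                        # (n_q, d)

rng = np.random.default_rng(0)
n, d = 16, 8
q, k, v = rng.normal(size=(3, n, d))
out = sparse_attention(q, k, v, top_k=4)
print(out.shape)  # (16, 8)
```

In a real implementation the savings come from never computing the masked scores at all; this sketch still materializes the full score matrix and only illustrates the attention pattern.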
Alongside the model release, DeepSeek announced a drastic price reduction of more than 50 percent for its API services. The new rates apply immediately and are intended to help the company attract more users. For comparison purposes, the previous V3.1 Terminus model will remain available via a temporary API until 15 October 2025.
Support from Huawei and new data formats
Huawei, the leading provider of AI chips in China, announced that its products will support the latest DeepSeek model.
DeepSeek has also stated that the latest versions of its models can handle 8-bit floating-point values (Floating Point 8, FP8), while work is underway to implement BF16 (Brain Floating Point 16). FP8 promises memory savings and faster calculations: each value occupies only a single byte, which cuts memory traffic and keeps the matrix arithmetic comparatively cheap. Although FP8 is less precise than classic formats such as FP32, it is considered sufficiently accurate for many AI applications.
BF16, on the other hand, represents a compromise between speed and precision. Support for both formats should make it possible to run large models on hardware with limited resources.
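The memory argument is simple arithmetic. Taking DeepSeek-V3's published total of 671 billion parameters as an example, the weight footprint scales directly with each format's byte width:

```python
# Rough weight-memory footprint of a 671B-parameter model
# (DeepSeek-V3's published total parameter count) in three
# numeric formats; bytes per value follow from the bit width.
params = 671e9
bytes_per_value = {"FP32": 4, "BF16": 2, "FP8": 1}

for fmt, nbytes in bytes_per_value.items():
    gib = params * nbytes / 2**30
    print(f"{fmt}: {gib:,.0f} GiB")
```

FP8 thus needs a quarter of FP32's memory and half of BF16's, which is what makes running large models on resource-constrained hardware plausible. Activations, optimizer state, and the KV cache add to the real footprint, so these figures are a lower bound.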
API prices reduced by 50 percent
With the price reduction of more than 50 percent, DeepSeek is positioning itself aggressively in the competitive AI API market. The company is thus joining a number of Chinese start-ups that want to gain market share through low prices. In the future, DeepSeek's input tokens will cost USD 0.28 per million tokens instead of the previous USD 0.56. With a cache hit, the price even falls to USD 0.028. One million output tokens will cost USD 0.42. Reservations about Chinese models remain, however, with regard to data protection and Chinese state censorship.
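What the new rates mean in practice can be worked out directly from the per-million-token prices quoted above; the workload sizes in the example are hypothetical:

```python
# DeepSeek's new API rates in USD per million tokens,
# as stated in the article.
INPUT = 0.28          # input tokens, cache miss
INPUT_CACHED = 0.028  # input tokens, cache hit
OUTPUT = 0.42         # output tokens

def cost_usd(in_tokens, out_tokens, cached_in_tokens=0):
    """Total cost of a request mix at the new rates."""
    return (in_tokens * INPUT
            + cached_in_tokens * INPUT_CACHED
            + out_tokens * OUTPUT) / 1e6

# Hypothetical workload: 10M uncached input tokens, 2M output tokens.
print(f"${cost_usd(10_000_000, 2_000_000):.2f}")  # $3.64
```

At the old USD 0.56 input rate, the same workload's input portion alone would have cost twice as much.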
(mki)
This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.