Huawei Claims Better AI Training Method Than DeepSeek Using Own Ascend Chips

Researchers working on Huawei Technologies’ large language model (LLM) Pangu claimed they have improved on DeepSeek’s original approach to training artificial intelligence (AI) by leveraging the US-sanctioned company’s proprietary hardware.

A paper – published last week by Huawei’s Pangu team, which comprises 22 core contributors and 56 additional researchers – introduced the concept of Mixture of Grouped Experts (MoGE). It is an upgraded version of the Mixture of Experts (MoE) technique that has been instrumental in DeepSeek’s cost-effective AI models.

While MoE keeps execution costs low relative to a model's large parameter count and enhances learning capacity, it often results in inefficiencies, according to the paper. The cause is the uneven activation of the so-called experts, which can hinder performance when the model runs on multiple devices in parallel.

In contrast, the improved MoGE “groups the experts during selection and better balances the expert workload”, researchers said.

In AI training, “experts” refer to specialised sub-models or components within a larger model, each designed to handle specific tasks or types of data. This allows the overall system to take advantage of diverse expertise to enhance performance.
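
The difference between the two selection schemes can be illustrated with a minimal Python sketch. It is not the Pangu team's implementation: the function names, the equal-sized groups, the fixed number of experts picked per group, and the idea that each group sits on one device are all assumptions made for illustration. It only shows why choosing experts within groups spreads the work more evenly than choosing the highest-scoring experts overall.

```python
def top_k_route(scores, k):
    """Plain MoE-style selection: take the k highest-scoring experts overall."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def grouped_route(scores, num_groups, k_per_group):
    """Grouped selection (assumed scheme): take the top experts inside each group."""
    group_size = len(scores) // num_groups  # assumes equal-sized groups
    chosen = []
    for g in range(num_groups):
        members = range(g * group_size, (g + 1) * group_size)
        ranked = sorted(members, key=lambda i: scores[i], reverse=True)
        chosen.extend(ranked[:k_per_group])
    return sorted(chosen)


def load_per_group(selected, num_groups, group_size):
    """Count how many selected experts fall in each group (one group per device, by assumption)."""
    load = [0] * num_groups
    for expert in selected:
        load[expert // group_size] += 1
    return load


# Hypothetical router scores for 8 experts, split into 2 groups of 4.
scores = [0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.3, 0.4]

plain = top_k_route(scores, k=4)
grouped = grouped_route(scores, num_groups=2, k_per_group=2)

print(plain, load_per_group(plain, 2, 4))      # [0, 1, 2, 3] [4, 0]
print(grouped, load_per_group(grouped, 2, 4))  # [0, 1, 6, 7] [2, 2]
```

With these made-up scores, plain top-k selection puts all four active experts in the first group and leaves the second idle, while grouped selection activates two experts in each group.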

The advancement comes at a crucial time, as Chinese AI companies focus on improving model training and inference efficiency through algorithmic improvements and closer integration of hardware and software, despite US restrictions on the export of advanced AI chips such as those from Nvidia.
