Huawei Claims Better AI Training Method Than DeepSeek Using Own Ascend Chips

Researchers working on Huawei Technologies’ large language model (LLM) Pangu claimed they have improved on DeepSeek’s original approach to training artificial intelligence (AI) by leveraging the US-sanctioned Huawei’s own Ascend hardware.

A paper – published last week by Huawei’s Pangu team, which comprises 22 core contributors and 56 additional researchers – introduced the concept of Mixture of Grouped Experts (MoGE). It is an upgraded version of the Mixture of Experts (MoE) technique that has been instrumental in DeepSeek’s cost-effective AI models.

While MoE keeps execution costs low for models with large parameter counts and enhances learning capacity, it often results in inefficiencies, according to the paper. The cause is the uneven activation of so-called experts, which can hinder performance when the model runs on multiple devices in parallel.

In contrast, the improved MoGE “groups the experts during selection and better balances the expert workload”, researchers said.

In AI training, the term “experts” refers to specialised sub-models or components within a larger model, each designed to handle specific tasks or types of data. This allows the overall system to take advantage of diverse expertise to enhance performance.
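
To make the contrast concrete, here is a minimal Python sketch of the two selection schemes. It is an illustration of the quoted description only, not the Pangu team's implementation: the function names, the contiguous grouping of experts, the expert and group counts, and the artificially skewed router scores are all assumptions made for this example.

```python
import random

def top_k(scores, k):
    """Return the indices of the k highest scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def route_moe(scores, k):
    # Plain MoE-style routing: take the k best experts overall.
    return top_k(scores, k)

def route_grouped(scores, num_groups, k_per_group):
    # Grouped routing (assumed form): split experts into equal contiguous
    # groups and take the k_per_group best experts inside each group.
    group_size = len(scores) // num_groups
    chosen = []
    for g in range(num_groups):
        start = g * group_size
        local = top_k(scores[start:start + group_size], k_per_group)
        chosen.extend(start + i for i in local)
    return chosen

def load_per_group(assignments, num_experts, num_groups):
    # Count how many expert activations land in each group.
    group_size = num_experts // num_groups
    load = [0] * num_groups
    for experts in assignments:
        for e in experts:
            load[e // group_size] += 1
    return load

# Hypothetical setup: 8 experts in 4 groups, 4 active experts per token.
NUM_EXPERTS, NUM_GROUPS, K, TOKENS = 8, 4, 4, 1000
rng = random.Random(0)
# Skew the scores toward experts 0 and 1 to mimic uneven activation.
token_scores = [
    [rng.random() + (0.5 if e < 2 else 0.0) for e in range(NUM_EXPERTS)]
    for _ in range(TOKENS)
]

moe_load = load_per_group(
    [route_moe(s, K) for s in token_scores], NUM_EXPERTS, NUM_GROUPS)
moge_load = load_per_group(
    [route_grouped(s, NUM_GROUPS, K // NUM_GROUPS) for s in token_scores],
    NUM_EXPERTS, NUM_GROUPS)

print("top-k load per group:  ", moe_load)
print("grouped load per group:", moge_load)
```

In this sketch every token activates the same number of experts in every group, so the grouped load is identical across groups by construction, while plain top-k routing piles activations onto the favoured group. If each group were placed on its own device (again an assumption here), that equal split is what would keep the devices evenly busy.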

The advancement comes at a crucial time, as Chinese AI companies are focused on enhancing model training and inference efficiency through algorithmic improvements and a synergy of hardware and software, despite US restrictions on the export of advanced AI chips like those from Nvidia.