Advanced AI News
Alibaba Cloud (Qwen)

80B Large Model Performance Soars, Inference Costs Plummet: A New Paradigm for AI Models?

By Advanced AI Editor, September 12, 2025


QbitAI has learned that the Qwen team has released its next-generation model architecture, Qwen3-Next. The update offers a preview of Qwen3.5 and open-sources the Qwen3-Next-80B-A3B-Base model. The model delivers significant performance gains while sharply reducing inference costs, pointing to a new trend in large-model development.

Innovative Hybrid Architecture: GatedDeltaNet and Hybrid Attention Mechanism

One of the core improvements in Qwen3-Next is its hybrid attention mechanism. To address the limitations of linear attention on long contexts and the high computational overhead of standard attention, the Qwen team introduced GatedDeltaNet into the architecture. GatedDeltaNet is strong at in-context learning, and the model uses a 3:1 hybrid strategy: 75% of layers use GatedDeltaNet and 25% retain standard attention, balancing performance and efficiency. Within the standard attention layers, the team further refined the output gating mechanism, enlarged the attention head dimension, and applied rotary position encoding to improve long-sequence extrapolation.
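The 3:1 ratio can be pictured as a repeating layer pattern. The following is a minimal sketch, not the Qwen team's implementation: the function name `hybrid_layer_pattern`, the 48-layer depth, and the choice to place the standard attention layer last in each group of four are all assumptions made for illustration.

```python
from collections import Counter


def hybrid_layer_pattern(num_layers: int, linear_per_full: int = 3) -> list[str]:
    """Build a layer-type list in which every (linear_per_full + 1)-th layer
    uses standard attention and the others use GatedDeltaNet.

    The position of the standard attention layer within each group is an
    assumption; the article only states the 3:1 ratio.
    """
    period = linear_per_full + 1
    return [
        "standard_attention" if (i + 1) % period == 0 else "gated_deltanet"
        for i in range(num_layers)
    ]


# Hypothetical depth of 48 layers, chosen only because it divides evenly by 4.
layers = hybrid_layer_pattern(48)
counts = Counter(layers)

print(layers[:4])
print(counts["gated_deltanet"] / len(layers))  # 0.75
```

With any depth divisible by four, the pattern reproduces the 75% / 25% split described above.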

High Sparsity MoE Architecture and Training Optimization

Qwen3-Next adopts a highly sparse MoE architecture: of its 80 billion total parameters, only about 3 billion are activated for each inference step. This design maximizes resource utilization while preserving performance. To improve stability, the team adopted Zero-Centered RMSNorm and applied weight decay to the norm weights. The MoE router's parameters are initialized so that every expert can be selected without bias early in training, which reduces the influence of initialization on experimental results. Together, these optimizations aim to make training more stable and efficient.
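To make the idea of sparse activation concrete, here is a minimal sketch of generic top-k expert routing for a single token, using only the standard library. It is an illustration of the general technique, not Qwen3-Next's router: the function names, the eight experts, the value of k, and the router scores are all hypothetical. The last lines work out the activated share implied by the article's figures.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Select the k experts with the highest router probability and
    renormalize their weights so they sum to 1. All other experts are
    skipped for this token, which is what keeps the active parameter
    count small."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


# Hypothetical router scores for one token over 8 experts.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.9]
weights = route_top_k(logits, k=2)
print(sorted(weights))  # [1, 4]

# Share of parameters active per step, from the article's figures.
active_fraction = 3e9 / 80e9
print(f"{active_fraction:.2%}")  # 3.75%
```

If all router logits start out equal, the softmax assigns every expert the same probability, which is one simple way to picture the unbiased early selection the article describes.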

Multi-Token Prediction Mechanism and Performance Leap

Qwen3-Next introduces a native Multi-Token Prediction (MTP) mechanism, which improves the overall performance of the model backbone and, through dedicated optimization, raises the acceptance rate of speculative decoding. Thanks to these innovations, Qwen3-Next achieves significant performance improvements. With only 15T tokens of pre-training data, it required less than 80% of the GPU hours used to train Qwen3-30B-A3B. Compared with Qwen3-32B, Qwen3-Next-80B-A3B delivers nearly 7 times the throughput in the prefill phase, and more than 10 times at context lengths above 32k. In the decoding phase, throughput at 4k context improves by about 4 times, and the advantage remains above 10 times in long-context scenarios. Building on Qwen3-Next, the Qwen team also released Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking, both of which perform strongly across multiple benchmarks, even surpassing the closed-source model Gemini-2.5-Flash-Thinking.
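Why the acceptance rate matters can be seen in a small sketch of speculative decoding with greedy verification. This is a generic illustration, not Qwen3-Next's MTP code: the function names and token values are hypothetical, and the expected-length formula assumes each drafted token is accepted independently with the same probability.

```python
def accept_draft(draft_tokens, target_tokens):
    """Greedy verification of a drafted continuation.

    target_tokens holds the target model's own choice at each draft
    position plus one extra token after the draft, so it has
    len(draft_tokens) + 1 entries. Draft tokens are kept until the first
    mismatch; at that position the target's token is used instead and the
    rest of the draft is discarded.
    """
    accepted = []
    for drafted, verified in zip(draft_tokens, target_tokens):
        if drafted != verified:
            accepted.append(verified)
            return accepted
        accepted.append(drafted)
    # Every draft token matched, so the extra target token comes for free.
    accepted.append(target_tokens[len(draft_tokens)])
    return accepted


def expected_tokens_per_pass(accept_rate, draft_len):
    """Expected tokens produced per verification pass, assuming each draft
    token is accepted independently with probability accept_rate."""
    if accept_rate == 1.0:
        return draft_len + 1
    return (1 - accept_rate ** (draft_len + 1)) / (1 - accept_rate)


# Hypothetical token ids: the third drafted token is rejected.
print(accept_draft([5, 7, 9], [5, 7, 2, 4]))  # [5, 7, 2]

# A higher acceptance rate yields more tokens per target-model pass.
for rate in (0.5, 0.8, 0.9):
    print(rate, round(expected_tokens_per_pass(rate, draft_len=3), 3))
```

Each verification pass costs roughly one forward pass of the large model, so raising the acceptance rate directly increases the number of tokens produced per pass.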

Measured Performance: AIME Competition Problems and Programming Applications

In practical use, Qwen3-Next-80B-A3B shows strong reasoning capabilities. On the QwenChat web page, the model solved AIME math competition problems almost instantly, providing detailed reasoning along with the answers. In programming, it generated p5js code for a Minesweeper game. These results illustrate the strong performance of Qwen3-Next across different tasks. Amid rapid advances in artificial intelligence, the release of Qwen3-Next brings new vitality to the industry.

Cost-Effectiveness and Future Outlook

While improving performance, Qwen3-Next has also significantly reduced training costs. According to official data, the training cost of Qwen3-Next-80B-A3B is only one-tenth that of Qwen3-32B. This gain in cost-effectiveness is expected to drive the adoption of AI technology in more fields. As the technology continues to advance, there is reason to believe that large models will achieve further breakthroughs in performance, efficiency, and cost. What technological innovations do you think will play a key role in the future development of large models?
