Advanced AI News

Smarter Software Vs. More Compute

By Advanced AI Bot, May 7, 2025


Daniel A. Keller, CEO and President of InFlux Technologies Limited. Cofounder of Flux.


When ChatGPT was released by OpenAI in 2022, it was the peak expression of AI chatbots built on large language models (LLMs). With an accessible interface and absolutely no need for external gadgets, it was the power of interactive AI in the palms of users, literally!

Barely five days after its launch, ChatGPT passed the 1 million user milestone. (For context, that took Facebook 10 months to achieve.) Of course, there were a few problems, like the occasional lags and hallucinations, but version after version, ChatGPT continued to expand its frontiers.

There were also apprehensions about the development cost of GPT-4, estimated at somewhere between $48 million and $71 million. But it was all completely justifiable. Sixteen thousand H100 GPUs don’t come cheap, and salaries have to be paid.

Or was it?

Rise Of The Deep

On January 20, 2025, the world woke up to news that would change the trajectory of AI technology. A little-known Chinese company had launched DeepSeek R1, an AI with capabilities comparable to OpenAI’s ChatGPT.

And the shocker?

The initial reports claimed it did it with fewer, cheaper and older GPUs at a development cost of only $5.6 million. The news sent shock waves across the markets. By the following Monday, Nvidia, the biggest supplier of AI GPU chips, had lost almost $600 billion in market value as investors started reconsidering their options. The Nasdaq index and companies like Microsoft and Alphabet also fell sharply. Within a week, DeepSeek had overtaken ChatGPT to become the most downloaded application on the Apple App Store.

But since then, DeepSeek has come under scrutiny, with the head of Google’s DeepMind calling its claims “exaggerated” and one critic suggesting it actually cost DeepSeek over $1 billion to create its AI model.

Nevertheless, DeepSeek’s arrival has caused a shift. The investment rationale for the AI supply chain had been quite simple: more spending meant better outcomes.

Until now.

The Paradigm Shift

DeepSeek’s story is exceptional for several reasons. First, as part of the United States’ efforts to stem the flow of advanced AI technology to competing nations, the Biden administration restricted the export of GPUs to China, limiting the availability of advanced AI GPUs like the A100 and the H100. As a result, DeepSeek presumably had to rely on less sophisticated but more available GPUs like the H800.

DeepSeek’s ability to turn this crippling limitation into one of the marvels of AI innovation raises a critical question: Are ingenuity and better software architecture a more sustainable alternative to advanced but expensive GPUs?

GPU availability (especially of advanced chips like the H100) is one of the rate-limiting steps for AI research and development. Even in the U.S., Nvidia, the top producer of GPUs globally, continues to struggle to meet demand. A breakthrough demonstrating that companies and research labs can get more out of their computing power and cut costs is a game-changer for the entire industry. But how exactly did DeepSeek achieve this?

Flipping The Game

Before DeepSeek’s emergence, AI had always been a game of who was bigger. Bigger financial investments translated into bigger LLMs, which in turn required more compute resources and, hopefully, delivered bigger innovative strides.

However, DeepSeek’s approach was counterintuitive. Instead of slapping on more compute and developing bigger models, the Chinese company focused on optimizing for a more efficient use of available resources. This included enhancing its model abilities through reinforcement learning, leveraging improved software architecture and optimizing its algorithm.

Rather than overpowering prevailing challenges with sheer brute force, DeepSeek turned the game on its head. Early benchmarks reportedly showed it was 20 times more efficient and far less compute-intensive than its more prominent competitors.

Because it relied heavily on reinforcement learning, DeepSeek-R1 also reduced the need for large teams of human reviewers and for extensive supervised fine-tuning, keeping operating costs down.

Another important choice was DeepSeek’s adoption of a mixture-of-experts (MoE) architecture. MoE uses multiple expert sub-models and a selective gating mechanism to activate only the most relevant parameters for each input. For context, the DeepSeek MoE framework comprises around 671 billion parameters; however, only about 37 billion of them (roughly 5.5%) are activated for any given token.

Picture a diverse team of seasoned experts across different disciplines. When needed, the gating mechanism dynamically selects the best combination of experts to solve the problem.

The result?

Dynamic routing and allocation cut the amount of computation the model requires, because experts that are not selected do no work for that input. This approach also improves efficiency, promotes seamless scalability and supports progressive fine-tuning of different expert system components for specific problems.
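To make the gating idea concrete, here is a minimal toy sketch of top-k expert routing in plain Python. It is not DeepSeek’s implementation: the function names (`moe_forward`, `make_expert`, `apply_expert`), the tiny dimensions and the random weights are all assumptions chosen for illustration, and production MoE layers add details such as load balancing that are omitted here.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    """Hypothetical toy expert: a dim x dim weight matrix."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


def apply_expert(weights, x):
    """Matrix-vector product: the only 'work' a toy expert does."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def moe_forward(x, experts, gate_weights, top_k):
    """Route x to the top_k highest-scoring experts and mix their outputs."""
    # One gate score per expert (dot product of its gate row with x).
    scores = [sum(g * xi for g, xi in zip(row, x)) for row in gate_weights]
    # Keep only the top_k experts; the rest are never evaluated.
    ranked = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]
    # Normalize the selected scores into mixing weights.
    mix = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for weight, idx in zip(mix, chosen):
        y = apply_expert(experts[idx], x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, chosen


# Assumed toy sizes, far smaller than any real model.
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
rng = random.Random(0)
experts = [make_expert(DIM, rng) for _ in range(NUM_EXPERTS)]
gate_weights = [[rng.uniform(-1, 1) for _ in range(DIM)]
                for _ in range(NUM_EXPERTS)]

x = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(x, experts, gate_weights, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS  # share of expert weights actually used
print("experts used:", chosen)
print("fraction of expert parameters active:", active_fraction)
```

In this toy setup only two of the eight experts run for a given input, so a quarter of the expert parameters take part in the computation. That is the same mechanism, at a much smaller scale, that lets a very large MoE model keep its per-input compute low.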

Implications For The Broader AI Industry

Compute-efficient AI solutions encourage democratization, allowing for dynamic innovations from different quarters. This could, in turn, promote cheaper access to AI resources, breaking Big Tech’s monopoly on AI innovation.

DeepSeek’s open-source nature provides a level playing field for researchers to engage in deep R&D without breaking the bank. Its lower energy requirements and smaller carbon footprint could also encourage more environmentally sustainable data center designs in the near future.

However, as revolutionary as the emergence of DeepSeek has been, there are also a few drawbacks (on top of the doubts about its claims).

First, while DeepSeek’s open-source nature encourages technology sharing and participation, it also means malicious actors can repurpose it, raising fresh concerns about heightened misinformation, deepfakes and other sinister possibilities.

Another concern is data sovereignty and the possibility of the Chinese government mining users’ data.

Rounding Off

While DeepSeek has demonstrated capabilities that are comparable to OpenAI’s ChatGPT in many ways, its long-term effect on AI technology, compute and market dynamics remains to be seen.

Whatever the future might hold, DeepSeek’s successful deployment of a powerful open-source model has leveled the playing field for innovation in the AI industry. As this filters into the mainstream, its ripple effect could shape the next iteration of artificial intelligence.
