Advanced AI News
Industry Applications

d-Matrix Takes On AI ‘Memory Wall’ with 3D Stacked In-Memory Compute

By Advanced AI Editor | August 28, 2025 | 5 Mins Read


(lafoto/Shutterstock)

The AI revolution has created huge demand for processing power to train frontier models, which Nvidia is filling with its high-end GPUs. But the sudden shift to AI inference and agentic AI in 2025 is exposing gaps in the memory pipeline, which d-Matrix hopes to address with its innovative 3D stacked digital in-memory compute (3DIMC) architecture, which it showed off at Hot Chips this week.

Even before the launch of ChatGPT ignited the AI revolution in late 2022, the folks at d-Matrix had identified an unfilled need for bigger, faster memory to serve large language models (LLMs). d-Matrix CEO and co-founder Sid Sheth was already predicting that the promising LLMs from OpenAI and Google, which were turning heads in the AI world and beyond, would drive a surge in AI inference workloads.

“We think this is going to be around for a long time,” Sheth told BigDATAwire in April 2022 about the transformative potential of LLMs. “We think people will essentially kind of gravitate around transformers for the next five to 10 years, and that is going to be the workhorse workload for AI compute for the next five to 10 years.”

Not only did Sheth correctly predict the transformative impact of the transformer model, but he also foresaw it would eventually result in a surge in AI inference workloads. That presented a business opportunity for Sheth and d-Matrix. The problem was that the GPU-based high performance computing architectures that worked well for training ever-bigger LLMs and frontier models were not ideal for running AI inference workloads. In fact, d-Matrix had identified that the problem extended all the way down into DRAM, which could not efficiently move data at the high speeds needed to support the looming AI inference workloads.

Memory growth lags compute growth (Source: d-Matrix)

d-Matrix’s solution was to focus on innovation at the memory layer. While DRAM could not keep up with AI inference demands, a faster and more expensive form of memory called SRAM, or static random access memory, was up to the task.

d-Matrix utilized digital in-memory compute (DIMC) technology that fused a processor directly into SRAM modules. Its Nighthawk architecture embedded DIMC chiplets directly on SRAM cards that plug into the PCIe bus, while its Jayhawk architecture provided die-to-die connectivity for scale-out processing. Both of these architectures were incorporated into the company’s flagship offering, dubbed Corsair, which today utilizes the latest PCIe Gen5 form factor and features ultra-high memory bandwidth of 150 TB/s.
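To see why that bandwidth figure matters for inference, a back-of-the-envelope sketch (not from d-Matrix; the model size and precision below are illustrative assumptions) treats LLM token decoding as memory-bound: generating each token requires streaming the full set of weights from memory at least once, so bandwidth caps token throughput.

```python
def decode_tokens_per_sec(params_billions, bytes_per_param, bandwidth_tb_s):
    """Memory-bound estimate: tokens/s = bandwidth / bytes moved per token.

    Assumes every weight is read once per generated token (batch size 1,
    no KV-cache traffic), i.e. an idealized bound on per-token data movement.
    """
    model_bytes = params_billions * 1e9 * bytes_per_param
    bandwidth_bytes_per_sec = bandwidth_tb_s * 1e12
    return bandwidth_bytes_per_sec / model_bytes

# Illustrative: a hypothetical 70B-parameter model at 8-bit precision
# (70 GB of weights) against Corsair's quoted 150 TB/s of memory bandwidth.
print(round(decode_tokens_per_sec(70, 1, 150)))  # → 2143
```

Real systems fall short of this idealized ceiling, but the sketch illustrates why raising memory bandwidth, rather than peak FLOPs, is the lever d-Matrix is pulling.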

Fast forward to 2025, and many of Sheth’s predictions have come to pass. We are firmly in the midst of a big shift from AI training to AI inference, with agentic AI poised to drive huge investments in the years to come. d-Matrix has kept pace with the needs of emerging AI workloads, and this week announced that its next-generation Pavehawk architecture, which uses 3D stacked DIMC technology (or 3DIMC), is now working in the lab.

Sheth is confident that 3DIMC will provide the performance boost to help AI inference get past the memory wall.

“AI inference is bottlenecked by memory, not just FLOPs. Models are growing fast and traditional HBM memory systems are getting very costly, power hungry and bandwidth limited,” Sheth wrote in a LinkedIn blog post. “3DIMC changes the game. By stacking memory in three dimensions and bringing it into tighter integration with compute, we dramatically reduce latency, improve bandwidth, and unlock new efficiency gains.”

d-Matrix’s new Pavehawk architecture supports 3DIMC technology (Image source: d-Matrix)

The memory wall has been looming for years, and is due to a mismatch in the advances of memory and processor technologies. “Industry benchmarks show that compute performance has grown roughly 3x every two years, while memory bandwidth has lagged at just 1.6x,” d-Matrix Founder and CTO Sudeep Bhoja shared in a blog post this week. “The result is a widening gap where pricey processors sit idle, waiting for data to arrive.”
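Those two growth rates compound into a rapidly widening gap. A quick sketch (my arithmetic, not d-Matrix's, applying the quoted 3x and 1.6x factors once per two-year interval):

```python
def compute_memory_gap(years, compute_growth=3.0, bandwidth_growth=1.6):
    """Ratio of compute growth to memory-bandwidth growth after `years`,
    assuming each quoted factor applies once per two-year interval."""
    steps = years / 2
    return (compute_growth / bandwidth_growth) ** steps

# After a decade of compounding, compute pulls roughly 23x ahead of
# memory bandwidth -- the widening gap behind the "memory wall".
print(round(compute_memory_gap(10), 1))  # → 23.2
```

Under those assumptions, processors spend an ever-larger fraction of each cycle stalled on data, which is exactly the idle-processor picture Bhoja describes.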

While it won’t completely close the gap with the latest GPUs, 3DIMC technology promises to substantially narrow it, Bhoja wrote. As Pavehawk comes to market, the company is already developing the next generation of its in-memory processing architecture that utilizes 3DIMC, dubbed Raptor.

“Raptor…will incorporate 3DIMC into its design–benefiting from what we and our customers learn from testing on Pavehawk,” Bhoja wrote. “By stacking memory vertically and integrating tightly with compute chiplets, Raptor promises to break through the memory wall and unlock entirely new levels of performance and TCO.”

How much better? According to Bhoja, d-Matrix is hoping for 10x better memory bandwidth and 10x better energy efficiency when running AI inference workloads with 3DIMC compared to HBM4.

“These are not incremental gains–they are step-function improvements that redefine what’s possible for inference at scale,” Bhoja wrote. “By putting memory requirements at the center of our design–from Corsair to Raptor and beyond–we are ensuring that inference is faster, more affordable, and sustainable at scale.”

This article first appeared on our sister publication, BigDATAwire.


About the author: Alex Woodie

Alex Woodie has written about IT as a technology journalist for more than a decade. He brings extensive experience from the IBM midrange marketplace, including topics such as servers, ERP applications, programming, databases, security, high availability, storage, business intelligence, cloud, and mobile enablement. He resides in the San Diego area.
