IBM is the first cloud service provider to make Intel® Gaudi® 3 AI accelerators available to customers, a move designed to make powerful artificial intelligence capabilities more accessible and to directly address the high cost of specialized AI hardware.
For Intel, the rollout on IBM Cloud marks the first major commercial deployment of Gaudi 3, bringing choice to the market. By leveraging Intel Gaudi 3 on IBM Cloud, the two companies aim to help clients cost-effectively test, innovate and deploy GenAI solutions.
According to a recent forecast by research firm Gartner, worldwide generative AI (GenAI) spending is expected to total $644 billion in 2025, an increase of 76.4% from 2024. The research found “GenAI will have a transformative impact across all aspects of IT spending markets, suggesting a future where AI technologies become increasingly integral to business operations and consumer products.”
For many enterprise customers, the benefits are clear when tools like GenAI automate tasks, improve workflows and drive innovation. But deploying AI applications demands significant computing power, often requiring expensive specialized processors that can keep many businesses from benefiting from AI.
Gaudi 3 AI accelerators are specifically designed to help meet the exploding demands for GenAI, large model inferencing and model fine-tuning while supporting an open development framework. Gaudi 3 is also ideal for multimodal large language models (LLMs) and retrieval-augmented generation (RAG).
“By bringing Intel Gaudi 3 AI accelerators to IBM Cloud, we’re enabling businesses to help scale generative AI workloads with optimized performance for inferencing and fine-tuning,” said Saurabh Kulkarni, vice president of Data Center AI Strategy at Intel. “This collaboration underscores our shared commitment to making AI more accessible and cost-effective for enterprises worldwide.”
How Enterprise Customers Use IBM Cloud
IBM Cloud serves a range of enterprise customers, particularly those in regulated industries, such as financial services, healthcare and life sciences, and the public sector.
Banks and insurance companies use the cloud for fraud detection or personalized customer service, while healthcare providers use it for accelerating drug discovery and development, AI-driven diagnostics, telemedicine platforms and real-time patient monitoring. Retailers use cloud technology for e-commerce platforms or inventory management. It’s also a go-to for companies looking to modernize old systems without giving up control or security.
Gaudi 3 is now available in the IBM Cloud regions of Frankfurt, Germany; Washington, D.C.; and Dallas, Texas.
Gaudi 3 is also being integrated into IBM’s broader AI infrastructure offerings. Customers can now use Gaudi 3 via IBM Cloud Virtual Servers on IBM Virtual Private Cloud (VPC). Customers will also be able to deploy across architectures starting in the second half of 2025, and support for Red Hat OpenShift and IBM’s watsonx AI platform is expected to be available this quarter.
“The ability to handle more data, and have higher performance, all of this is going to drive better adoption of AI for customers worldwide,” said Satinder Sethi, general manager of IBM Cloud Infrastructure Services. “Intel Gaudi 3 is giving customers more choice, more freedom and a more cost-effective platform of which AI hardware they want to use.”
Cost and Performance Comparisons
Intel Gaudi 3 AI accelerators are designed to tackle the cost challenge by balancing performance and price. New AI inferencing benchmark tests conducted by research firm Signal65, and commissioned by Intel, found that Gaudi 3 is 92% more cost-efficient (performance per dollar) than the competition when running Meta’s Llama-3.1-405B-Instruct-FP8 model with large context sizes1.
Cost efficiency is a crucial metric because it allows businesses to do more AI processing for the same investment or the same amount of processing at a lower cost. Performance gains are intended to lower the cost barrier for companies looking to deploy or fine-tune models, particularly as GenAI adoption spreads.
Throughput measures the amount of AI processing an accelerator can perform in a given amount of time, expressed here as tokens per second. Gaudi 3 delivers significantly faster AI processing than the competition: on the IBM Granite-3.1-8B-Instruct model, Gaudi 3 provided 43% more tokens per second for small AI workloads1, and it delivered 36% more tokens per second with large context sizes when running Meta’s Llama-3.1-405B-Instruct-FP8 model1.
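To make the two metrics concrete, here is a minimal sketch of how throughput and cost efficiency are typically computed. All numbers below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not Signal65 results or actual accelerator prices.

```python
# Illustrative sketch of the two benchmark metrics discussed above.
# All figures are hypothetical, not measured results.

def tokens_per_second(total_tokens: int, elapsed_seconds: float) -> float:
    """Throughput: tokens generated per second of wall-clock time."""
    return total_tokens / elapsed_seconds

def perf_per_dollar(throughput: float, hourly_cost: float) -> float:
    """Cost efficiency: throughput per dollar of hourly instance cost."""
    return throughput / hourly_cost

# Hypothetical accelerator A vs. accelerator B on the same workload
a = perf_per_dollar(tokens_per_second(120_000, 60), hourly_cost=10.0)
b = perf_per_dollar(tokens_per_second(100_000, 60), hourly_cost=16.0)

# Relative cost-efficiency advantage of A over B, as a percentage
advantage = (a / b - 1) * 100
print(f"A is {advantage:.0f}% more cost efficient than B")  # prints "A is 92% more cost efficient than B"
```

The example shows why a modest throughput edge can translate into a much larger performance-per-dollar gap when the hourly price differs: the metric divides throughput by cost, so both factors compound.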
More: IBM Empowers Enterprises to Scale AI (Intel.com) | Intel and IBM Announce the Availability of Intel Gaudi 3 AI Accelerators on IBM Cloud (IBM)