IBM’s newest mainframe generation, the z17, is now generally available.
Announced in April this year, the mainframe has been engineered specifically to support AI capabilities and is the result of five years of development.
The z17 features a second-generation on-chip AI accelerator built on the IBM Telum II processor, and can process 50 percent more AI inference operations per day than its predecessor, the z16.
The company says the new mainframe pairs increased frequency and compute capacity with 40 percent more cache, enabling more than 450 billion inference operations per day at a one-millisecond response time.
The z17 will also feature the 32-core IBM Spyre accelerator – due to launch in Q4 – to provide additional AI compute capabilities alongside the Telum II.
Speaking to DCD at the z17 launch event, IBM’s general manager of Z and LinuxONE, Ross Mauri, said that while the z17 took five years to develop, the company overlaps its projects. “We always overlap the development. We’re working on the next three: the next two in earnest, and the third one we are doing research for technology selection.
“You have to know you are choosing the right node for the microprocessor, because the design needs to be compatible. You can’t take a design that works in five nanometers, pick it up and move it to seven, or vice versa.”
According to Mauri, when choosing the technology, IBM asks questions such as: Is the technology reliable? Is it at the right cost point? Can the manufacturer provide it? IBM is currently using Samsung as its foundry.
Working so far in advance can be challenging, particularly with AI and the hardware behind it developing so rapidly. On that point, Mauri said: “It depends on how well you read the tea leaves.”
“We’ve over-engineered it to the point where we don’t have any clients pushing the AI capability of the z16 anywhere near what it’s capable of. So, what we are doing, is guessing ahead.
“The things that actually change are the models, frameworks, the modeling techniques, and that’s all done through software. So let’s say a year into our hardware is out there, and there’s a new AI model. Well, what we have to do is we take the models, which are trained anywhere, and you import them into Z, and we run them through our deep learning compiler, which optimizes it for the hardware,” he explained.
If the z16 is over-engineered beyond what clients currently need, that raises the question of why companies would switch to the new model at all.
According to Mauri, most clients lease the Z systems, and it becomes “financially advantageous” to get something “bigger and better.”
“The z17 is significantly faster, has significantly more capacity, and more memory. There’s a lot more hardware in there, and clients’ workloads are growing.”
Mainframe computers are often assumed to be dated technology, having been around for decades. Mauri acknowledged the so-called “demise of the mainframe,” though said that the technology still has a “unique place” and does processing that other technologies can’t.
“That little space that the mainframe participates in, you can’t really run that anywhere else. So that’s why we brought AI into the mainframe.”
The primary users of mainframes, Mauri said, are financial services companies, airlines, and governments, among others, though he noted that banking dominates the industry landscape.
The z17, now available, ships with the AI accelerator built into the Telum II processor, with the IBM Spyre accelerator to be added later. According to Mauri, once those accelerators ship, an IBM service rep will manually plug the cards into customers’ z17s.