Chinese startup Z.ai today open-sourced GLM-4.5, a reasoning model that it claims is more cost-efficient than DeepSeek’s R1.
CNBC reported that the algorithm can run on eight H20 graphics cards. The H20 is a scaled-down version of Nvidia Corp.’s H100 chip, which was its flagship artificial intelligence accelerator until last year. The U.S. government recently greenlit the sale of the former processor to companies in China.
The launch of GLM-4.5 comes about six months after DeepSeek released its open-source R1 reasoning model. At the time, the company stated that the algorithm can perform some tasks using 50 times less hardware than OpenAI’s o1. Furthermore, DeepSeek claimed to have trained its model for a fraction of the cost of earlier AI projects.
R1’s release led to investor concerns that increasingly hardware-efficient language models may lower demand for AI infrastructure. Nvidia’s market capitalization dropped more than $580 billion in the subsequent selloff, setting a new Wall Street record. The release of GLM-4.5 today didn’t lead to a similar drop in AI stocks, but it sends investors another signal that reasoning models are continuing to become more hardware-efficient.
Z.ai reportedly expects to charge 11 cents for every 1 million input tokens entered into GLM-4.5. That’s three cents less than DeepSeek charges for R1. Output is priced at 28 cents per 1 million tokens, just over one-tenth of what DeepSeek charges for R1.
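To put those per-token prices in per-request terms, here is a minimal Python sketch. The two GLM-4.5 prices come from the figures reported above; the `request_cost` helper and the 2,000-token prompt with a 1,000-token response are hypothetical, chosen only for illustration.

```python
# Reported GLM-4.5 prices, in US dollars per 1 million tokens.
GLM_45_INPUT_PER_M = 0.11
GLM_45_OUTPUT_PER_M = 0.28


def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Return the dollar cost of one request at per-million-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical workload: a 2,000-token prompt and a 1,000-token response.
cost = request_cost(2_000, 1_000, GLM_45_INPUT_PER_M, GLM_45_OUTPUT_PER_M)
print(f"GLM-4.5 cost per request: ${cost:.6f}")
print(f"GLM-4.5 cost per 1 million such requests: ${cost * 1_000_000:,.2f}")
```

At these rates, the output tokens account for more of the bill than the input tokens even though the response is half the length of the prompt, which is why the output price tends to matter most for reasoning models that generate long answers.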
One of the main factors behind GLM-4.5’s cost efficiency is that it’s relatively small. The model features 355 billion parameters, or about 316 billion fewer than R1. To reduce hardware usage, GLM-4.5 activates only 32 billion of those parameters at any given time.
An AI model comprises numerous code snippets called artificial neurons that each perform a tiny portion of the work involved in processing a prompt. Those neurons, in turn, are organized into so-called layers. Z.ai removed some of GLM-4.5’s components to add more layers, an approach that it says helped boost the model’s reasoning skills.
The company trained GLM-4.5 through a multistep workflow. First, it developed an initial version of the model using a dataset that included 15 trillion tokens’ worth of information. Z.ai then honed GLM-4.5’s reasoning skills with several smaller training datasets that together comprised more than 7 trillion tokens.
The company evaluated the model’s capabilities using a dozen popular AI benchmarks. According to Z.ai, GLM-4.5 outperformed multiple popular alternatives including Claude 4 Opus. It ranked third behind xAI Holdings Corp.’s Grok 4 and OpenAI’s o3.
For use cases that place particular emphasis on cost-efficiency, Z.ai has developed a scaled-down version of its model called GLM-4.5-Air. The algorithm features 106 billion parameters, or roughly 30% as many as the original. GLM-4.5-Air activates 12 billion parameters to process prompts.
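The parameter counts reported for the two models can be compared with a few lines of arithmetic. The Python sketch below uses only the figures cited in this article; the `active_share` helper and the `MODELS` table are illustrative names, not part of any Z.ai tooling.

```python
# Parameter counts in billions, as reported for each model.
MODELS = {
    "GLM-4.5": {"total": 355, "active": 32},
    "GLM-4.5-Air": {"total": 106, "active": 12},
}


def active_share(total, active):
    """Return the fraction of a model's parameters used at any given time."""
    return active / total


for name, params in MODELS.items():
    share = active_share(params["total"], params["active"])
    print(f"{name}: {share:.1%} of parameters active")

# How much larger the full model is than the scaled-down version.
size_ratio = MODELS["GLM-4.5"]["total"] / MODELS["GLM-4.5-Air"]["total"]
print(f"GLM-4.5 is {size_ratio:.2f}x the size of GLM-4.5-Air")
```

By these numbers, both models use only about a tenth of their parameters at a time, so the hardware needed to process a prompt tracks the active count more closely than the headline total.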
In January, the U.S. Commerce Department added Z.ai to its Entity List of organizations subject to export controls. The company is backed by $1.5 billion in funding from Alibaba Group, Tencent Inc. and other investors. It reportedly plans to file for a public offering later this year.