Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

What Do We Want From Legal AI? – Artificial Lawyer

Paper page – Music Arena: Live Evaluation for Text-to-Music

Anthropic Sets Weekly Limits on Claude AI to Curb Misuse, Maintain Reliability

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • OpenAI (GPT-4 / GPT-4o)
    • Anthropic (Claude 3)
    • Google DeepMind (Gemini)
    • Meta (LLaMA)
    • Cohere (Command R)
    • Amazon (Titan)
    • IBM (Watsonx)
    • Inflection AI (Pi)
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • AI Experts
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • The TechLead
    • Matt Wolfe AI
    • Andrew Ng
    • OpenAI
    • Expert Blogs
      • François Chollet
      • Gary Marcus
      • IBM
      • Jack Clark
      • Jeremy Howard
      • Melanie Mitchell
      • Andrew Ng
      • Andrej Karpathy
      • Sebastian Ruder
      • Rachel Thomas
      • IBM
  • AI Tools
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
  • AI Policy
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
  • Industry AI
    • Finance AI
    • Healthcare AI
    • Education AI
    • Energy AI
    • Legal AI
LinkedIn Instagram YouTube Threads X (Twitter)
Advanced AI News
VentureBeat AI

Chinese startup Z.ai launches powerful open source GLM-4.5 model family with PowerPoint creation

By Advanced AI EditorJuly 28, 2025No Comments10 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now

Another week in the summer of 2025 has begun, and in a continuation of the trend from last week, with it arrives more powerful Chinese open source AI models.

Little-known (at least to us here in the West) Chinese startup Z.ai has introduced two new open source LLMs — GLM-4.5 and GLM-4.5-Air — casting them as go-to solutions for AI reasoning, agentic behavior, and coding.

And according to Z.ai’s blog post, the models perform near the top of the pack of other proprietary LLM leaders in the U.S.

For example, the flagship GLM-4.5 matches or outperforms leading proprietary models like Claude 4 Sonnet, Claude 4 Opus, and Gemini 2.5 Pro on evaluations such as BrowseComp, AIME24, and SWE-bench Verified, while ranking third overall across a dozen competitive tests.

The AI Impact Series Returns to San Francisco – August 5

The next phase of AI is here – are you ready? Join leaders from Block, GSK, and SAP for an exclusive look at how autonomous agents are reshaping enterprise workflows – from real-time decision-making to end-to-end automation.

Secure your spot now – space is limited: https://bit.ly/3GuuPLF

Its lighter-weight sibling, GLM-4.5-Air, also performs within the top six, offering strong results relative to its smaller scale.

Both models feature dual operation modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant response scenarios. They can automatically generate complete PowerPoint presentations from a single title or prompt, making them useful for meeting preparation, education, and internal reporting.

They further offer creative writing, emotionally aware copywriting, and script generation to create branded content for social media and the web. Moreover, z.ai says they support virtual character development and turn-based dialogue systems for customer support, roleplaying, fan engagement, or digital persona storytelling.

While both models support reasoning, coding, and agentic capabilities, GLM-4.5-Air is designed for teams seeking a lighter-weight, more cost-efficient alternative with faster inference and lower resource requirements.

Z.ai also lists several specialized models in the GLM-4.5 family on its API, including GLM-4.5-X and GLM-4.5-AirX for ultra-fast inference, and GLM-4.5-Flash, a free variant optimized for coding and reasoning tasks.

They’re available now to use directly on Z.ai and through the Z.ai application programming interface (API) for developers to connect to third-party apps, and their code is available on HuggingFace and ModelScope. The company also provides multiple integration routes, including support for inference via vLLM and SGLang.

Licensing and API pricing

GLM-4.5 and GLM-4.5-Air are released under the Apache 2.0 license, a permissive and commercially friendly open-source license.

This allows developers and organizations to freely use, modify, self-host, fine-tune, and redistribute the models for both research and commercial purposes.

For those who don’t want to download the model code or weights and self-host or deploy on their own, z.ai’s cloud-based API offers the model for the following prices.

GLM-4.5:

$0.60 / $2.20 per 1 million input/output tokens

GLM-4.5-Air:

$0.20 / $1.10 per 1M input/output tokens

A CNBC article on the models reported that z.ai would charge only $0.11 / $0.28 per million input/output tokens, which is also supported by a Chinese graphic the company posted on its API documentation for the “Air model.”

However, this appears to be the case only for inputting up to 32,000 tokens and outputting 200 tokens at a single time. (Recall tokens are the numerical designations the LLM uses to represent different semantic concepts and word components, the LLM’s native language, with each token translating to a word or portion of a word).

In fact, the Chinese graphic reveals far more detailed pricing for both models per batches of tokens inputted/outputted. I’ve tried to translate it below:

Another note: since z.ai is based in China, those in the West who are focused on data sovereignty will want to due diligence through internal policies to pursue using the API, as it may be subject to Chinese content restrictions.

Competitive performance on third-party benchmarks, approaching that of leading closed/proprietary LLMs

GLM-4.5 ranks third across 12 industry benchmarks measuring agentic, reasoning, and coding performance—trailing only OpenAI’s GPT-4 and xAI’s Grok 4. GLM-4.5-Air, its more compact sibling, lands in sixth position.

In agentic evaluations, GLM-4.5 matches Claude 4 Sonnet in performance and exceeds Claude 4 Opus in web-based tasks. It achieves a 26.4% accuracy on the BrowseComp benchmark, compared to Claude 4 Opus’s 18.8%. In the reasoning category, it scores competitively on tasks such as MATH 500 (98.2%), AIME24 (91.0%), and GPQA (79.1%).

For coding, GLM-4.5 posts a 64.2% success rate on SWE-bench Verified and 37.5% on Terminal-Bench. In pairwise comparisons, it outperforms Qwen3-Coder with an 80.8% win rate and beats Kimi K2 in 53.9% of tasks. Its agentic coding ability is enhanced by integration with tools like Claude Code, Roo Code, and CodeGeex.

The model also leads in tool-calling reliability, with a success rate of 90.6%, edging out Claude 4 Sonnet and the new-ish Kimi K2.

Part of the wave of open source Chinese LLMs

The release of GLM-4.5 arrives amid a surge of competitive open-source model launches in China, most notably from Alibaba’s Qwen Team.

In the span of a single week, Qwen released four new open-source LLMs, including the reasoning-focused Qwen3-235B-A22B-Thinking-2507, which now tops or matches leading models such as OpenAI’s o4-mini and Google’s Gemini 2.5 Pro on reasoning benchmarks like AIME25, LiveCodeBench, and GPQA.

This week, Alibaba continued the trend with the release of Wan 2.2, a powerful new open source video model.

Alibaba’s new models are, like z.ai, licensed under Apache 2.0, allowing commercial usage, self-hosting, and integration into proprietary systems.

The broad availability and permissive licensing of Alibaba’s offerings and Chinese startup Moonshot before it with its Kimi K2 model reflects an ongoing strategic effort by Chinese AI companies to position open-source infrastructure as a viable alternative to closed U.S.-based models.

It also places pressure on the U.S.-based model provider efforts to compete in open source. Meta has been on a hiring spree after its Llama 4 model family debuted earlier this year to a mixed response from the AI community, including a hefty dose of criticism for what some AI power users saw as benchmark gaming and inconsistent performance.

Meanwhile, OpenAI co-founder and CEO Sam Altman recently announced that OpenAI’s long-awaited and much-hyped frontier open source LLM — its first since before ChatGPT launched in late 2022 — would be delayed from its originally planned July release to an as-yet unspecified later date.

Architecture and training lessons revealed

GLM-4.5 is built with 355 billion total and 32 billion active parameters. Its counterpart, GLM-4.5-Air, offers a lighter-weight design at 106 billion total and 12 billion active parameters.

Both use a Mixture-of-Experts (MoE) architecture, optimized with loss-free balance routing, sigmoid gating, and increased depth for enhanced reasoning.

The self-attention block includes Grouped-Query Attention and a higher number of attention heads. A Multi-Token Prediction (MTP) layer enables speculative decoding during inference.

Pre-training spans 22 trillion tokens split between general-purpose and code/reasoning corpora. Mid-training adds 1.1 trillion tokens from repo-level code data, synthetic reasoning inputs, and long-context/agentic sources.

Z.ai’s post-training process for GLM-4.5 relied upon a reinforcement learning phase powered by its in-house RL infrastructure, slime, which separates data generation and model training processes to optimize throughput on agentic tasks.

Among the techniques they used were mixed-precision rollouts and adaptive curriculum learning.
The former help the model train faster and more efficiently by using lower-precision math when generating data, without sacrificing much accuracy.

Meanwhile, adaptive curriculum learning means the model starts with easier tasks and gradually moves to harder ones, helping it learn more complex tasks gradually over time.

GLM-4.5’s architecture prioritizes computational efficiency. According to CNBC, Z.ai CEO Zhang Peng stated that the model runs on just eight Nvidia H20 GPUs — custom silicon designed for the Chinese market to comply with U.S. export controls. That’s roughly half the hardware requirement of DeepSeek’s comparable models.

Interactive demos

Z.ai highlights full-stack development, slide creation, and interactive artifact generation as demonstration areas on its blog post.

Examples include a Flappy Bird clone, Pokémon Pokédex web app, and slide decks built from structured documents or web queries.

Users can interact with these features on the Z.ai chat platform or through API integration.

Company background and market position

Z.ai was founded in 2019 under the name Zhipu, and has since grown into one of China’s most prominent AI startups, according to CNBC.

The company has raised over $1.5 billion from investors including Alibaba, Tencent, Qiming Venture Partners, and municipal funds from Hangzhou and Chengdu, with additional backing from Aramco-linked Prosperity7 Ventures.

Its GLM-4.5 launch coincides with the World Artificial Intelligence Conference in Shanghai, where multiple Chinese firms showcased advancements. Z.ai was also named in a June OpenAI report highlighting Chinese progress in AI, and has since been added to a U.S. entity list limiting business with American firms.

What it means for enterprise technical decision-makers

For senior AI engineers, data engineers, and AI orchestration leads tasked with building, deploying, or scaling language models in production, the GLM-4.5 family’s release under the Apache 2.0 license presents a meaningful shift in options.

The model offers performance that rivals top proprietary systems across reasoning, coding, and agentic benchmarks — yet comes with full weight access, commercial usage rights, and flexible deployment paths, including cloud, private, or on-prem environments.

For those managing LLM lifecycles — whether leading model fine-tuning, orchestrating multi-stage pipelines, or integrating models with internal tools — GLM-4.5 and GLM-4.5-Air reduce barriers to testing and scaling.

The models support standard OpenAI-style interfaces and tool-calling formats, making it easier to evaluate in sandboxed environments or drop into existing agent frameworks.

GLM-4.5 also supports streaming output, context caching, and structured JSON responses, enabling smoother integration with enterprise systems and real-time interfaces. For teams building autonomous tools, its deep thinking mode provides more precise control over multi-step reasoning behavior.

For teams under budget constraints or those seeking to avoid vendor lock-in, the pricing structure undercuts major proprietary alternatives like DeepSeek and Kimi K2. This matters for organizations where usage volume, long-context tasks, or data sensitivity make open deployment a strategic necessity.

For professionals in AI infrastructure and orchestration, such as those implementing CI/CD pipelines, monitoring models in production, or managing GPU clusters, GLM-4.5’s support for vLLM, SGLang, and mixed-precision inference aligns with current best practices in efficient, scalable model serving. Combined with open-source RL infrastructure (slime) and a modular training stack, the model’s design offers flexibility for tuning or extending in domain-specific environments.

In short, GLM-4.5’s launch gives enterprise teams a viable, high-performing foundation model they can control, adapt, and scale, without being tied to proprietary APIs or pricing structures. It’s a compelling option for teams balancing innovation, performance, and operational constraints.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleHarmonic, the Robinhood CEO’s AI math startup, launches an AI chatbot app
Next Article OpenAI prepares GPT-5 for roll out
Advanced AI Editor
  • Website

Related Posts

Anthropic throttles Claude rate limits, devs call foul

July 29, 2025

No more links, no more scrolling—The browser is becoming an AI Agent

July 29, 2025

How E2B became essential to 88% of Fortune 100 companies and raised $21 million

July 28, 2025

Comments are closed.

Latest Posts

Picasso’s ‘Demoiselles’ May Not Have Been Inspired by African Art

Catalan National Assembly protested the restitution of murals to Aragon.

UNESCO Adds 26 Sites to World Heritage List

Aspen Art Fair Doubles in Size for 2025 Edition

Latest Posts

What Do We Want From Legal AI? – Artificial Lawyer

July 29, 2025

Paper page – Music Arena: Live Evaluation for Text-to-Music

July 29, 2025

Anthropic Sets Weekly Limits on Claude AI to Curb Misuse, Maintain Reliability

July 29, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • What Do We Want From Legal AI? – Artificial Lawyer
  • Paper page – Music Arena: Live Evaluation for Text-to-Music
  • Anthropic Sets Weekly Limits on Claude AI to Curb Misuse, Maintain Reliability
  • Jim Cramer Notes IBM Stock Sell-Off Despite Strong Earnings
  • Tesla signs $16.5B deal with Samsung to make AI chips

Recent Comments

  1. binance kód on Anthropic closes $2.5 billion credit facility as Wall Street continues plunging money into AI boom – NBC Los Angeles
  2. 🖨 🔵 Incoming Message: 1.95 Bitcoin from exchange. Claim transfer => https://graph.org/ACTIVATE-BTC-TRANSFER-07-23?hs=40f06aae45d2dc14b01045540f836756& 🖨 on SFC Dialogue丨Jeffrey Sachs says he uses DeepSeek every hour_to_facts_its
  3. 📪 ✉️ Unread Notification: 1.65 BTC from user. Claim transfer >> https://graph.org/ACTIVATE-BTC-TRANSFER-07-23?hs=63f0a8159ef8316c31f5a9a8aca50f39& 📪 on Sean Carroll: Arrow of Time
  4. 🔋 📬 Unread Alert - 1.65 BTC from exchange. Accept funds > https://graph.org/ACTIVATE-BTC-TRANSFER-07-23?hs=db3ef91843302da628b83636ef7db949& 🔋 on Rohit Prasad: Amazon Alexa and Conversational AI | Lex Fridman Podcast #57
  5. 📟 ✉️ New Alert: 1.95 Bitcoin from partner. Review funds => https://graph.org/ACTIVATE-BTC-TRANSFER-07-23?hs=945d7d4685640a791a641ab7baaf111d& 📟 on OpenAI’s $3 Billion Windsurf Acquisition Changes AI Forever

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

LinkedIn Instagram YouTube Threads X (Twitter)
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.