IBM has released Granite 4.0, its latest family of open large language models (LLMs), featuring a hybrid Mamba/transformer architecture to reduce memory requirements and hardware costs. The company announced the launch on October 2, 2025.
According to IBM, Granite 4.0 models can run on significantly cheaper GPUs while maintaining performance. “Granite 4.0 features a new hybrid Mamba/transformer architecture that greatly reduces memory requirements without sacrificing performance,” the company said.
The models are open-sourced under the Apache 2.0 license and are the first open models to receive ISO 42001 certification, confirming alignment with international standards for AI security, governance and transparency. All Granite 4.0 checkpoints are cryptographically signed to verify provenance and authenticity.
Granite 4.0 is available through IBM watsonx.ai and platform partners, including Dell Technologies, Docker Hub, Hugging Face, Kaggle, LM Studio, NVIDIA NIM, Ollama, OPAQUE and Replicate. Access via Amazon SageMaker JumpStart and Microsoft Azure AI Foundry is planned.
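For developers who want to try the models locally, loading a checkpoint from Hugging Face follows the standard transformers workflow. The snippet below is a minimal illustrative sketch: the model ID assumes IBM's ibm-granite naming convention and may differ from the published repository name, and a recent transformers release is needed for the hybrid Mamba/transformer layers.

```python
# Minimal sketch of loading a Granite 4.0 checkpoint from Hugging Face.
# The model ID is an assumed example based on IBM's naming convention;
# check the ibm-granite organization on Hugging Face for the exact name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-h-micro"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Format a chat-style prompt and generate a short completion.
messages = [{"role": "user", "content": "Summarize these meeting notes in three bullet points."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The hybrid checkpoints require runtime support for the Mamba layers, which is why IBM also ships the conventional Granite-4.0-Micro transformer variant described below for platforms that have not yet added that support.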
The release includes Granite-4.0-H-Small (32B parameters, 9B active), Granite-4.0-H-Tiny (7B parameters, 1B active), and Granite-4.0-H-Micro (3B parameters). A conventional transformer variant, Granite-4.0-Micro, is also available for platforms that do not yet support hybrid architectures.
IBM stated that Granite 4.0 models reduce RAM usage by more than 70% compared to transformer-based models in tasks involving long inputs and concurrent sessions. “The more you throw at them, the more their advantages are apparent,” the company said.
Benchmarking showed that Granite 4.0 models outperform earlier Granite releases, with Granite-4.0-H-Small exceeding all open-weight models except Llama 4 Maverick on the IFEval instruction-following benchmark, as reported via Stanford HELM. On the Berkeley Function Calling Leaderboard v3, Granite 4.0 models kept pace with larger competitors at lower cost.
Enterprise partners, including EY and Lockheed Martin, tested Granite 4.0 before launch. IBM has also partnered with HackerOne on a bug bounty program, offering up to $100,000 for vulnerabilities or jailbreak exploits.
Training for Granite 4.0 was conducted on a 22T-token enterprise-focused corpus, with instruction-tuned models available at launch and reasoning-focused models planned for later this fall. Additional releases, including Granite 4.0 Medium and Granite 4.0 Nano, are expected by year-end.
IBM said this release aims to lower barriers to entry by providing enterprises and open-source developers alike with cost-effective access to highly competitive LLMs.