IBM AI team releases Granite 4.0 Nano series: compact open-source small models built for edge AI
Small models are often hampered by poorly tuned instruction following, weak tool-call formats, and a lack of governance. IBM's AI team has released Granite 4.0 Nano, a small family of models targeting on-premises and edge inference with enterprise controls and open licensing. The range includes 8 models in two sizes, 350M and about 1B, spanning hybrid SSM and pure transformer variants, each in base and instruct versions. Granite 4.0 Nano models are released under the Apache 2.0 license and have native architecture support in popular runtimes such as vLLM, llama.cpp, and MLX.

What’s New in the Granite 4.0 Nano Series?
Granite 4.0 Nano consists of four instruct models and their base counterparts. Granite 4.0 H 1B uses a hybrid SSM-based architecture with approximately 1.5B parameters. Granite 4.0 H 350M uses the same hybrid approach at 350M. For maximum runtime portability, IBM also offers Granite 4.0 1B and Granite 4.0 350M as pure transformer versions.
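To make the packaging concrete, here is a minimal sketch of loading one of the instruct checkpoints with Hugging Face transformers. The repo id `ibm-granite/granite-4.0-h-1b` is an assumption based on IBM's naming; check the Granite collection on Hugging Face for the exact identifiers.

```python
# Minimal sketch: load a Granite 4.0 Nano instruct model with transformers.
# The repo id below is an assumption; verify it on the HF Granite collection.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-h-1b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is a hybrid SSM-transformer model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```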
| Granite release | Sizes released | Architecture | License and governance | Key points |
|---|---|---|---|---|
| Granite 13B, the first watsonx Granite model | 13B base, 13B instruct, later 13B chat | Decoder-only transformer, 8K context | IBM enterprise terms, client protections | First public Granite model on watsonx, curated enterprise data, English focus |
| Granite Code models (open) | 3B, 8B, 20B, 34B code models, base and instruct | Decoder-only transformer, two-phase code training on 116 languages | Apache 2.0 | First fully open Granite line, for code intelligence, paper arXiv:2405.04324, available on HF and GitHub |
| Granite 3.0 language models | 2B and 8B, base and instruct | Transformer, 128K context for instruct | Apache 2.0 | Enterprise LLMs for RAG, tool use, summarization, available on watsonx and HF |
| Granite 3.1 language models (HF) | 1B A400M, 3B A800M, 2B, 8B | Transformer, 128K context | Apache 2.0 | Size ladder for enterprise tasks, base and instruct variants, same Granite data recipe |
| Granite 3.2 language models (HF) | 2B instruct, 8B instruct | Transformer, 128K, better long-prompt handling | Apache 2.0 | Quality-improvement iteration on 3.x, keeps enterprise alignment |
| Granite 3.3 language models (HF) | 2B base, 2B instruct, 8B base, 8B instruct, all 128K | Decoder-only transformer | Apache 2.0 | Latest 3.x line on HF before 4.0, adds FIM and better instruction following |
| Granite 4.0 language models | 3B Micro, 3B H Micro, 7B H Tiny, 32B H Small, plus transformer variants | Hybrid Mamba-2 plus transformer for H models, pure transformer for compatibility | Apache 2.0, ISO 42001, cryptographically signed | Start of the hybrid generation, lower memory, agent-friendly, same governance across scales |
| Granite 4.0 Nano language models | 1B H base and instruct, 350M H base and instruct, 1B transformer base and instruct (~2B params), 350M transformer base and instruct (~0.4B params), 8 in total | H models are hybrid SSM plus transformer, non-H models are pure transformer | Apache 2.0, ISO 42001, signed, same 4.0 pipeline | Smallest Granite models, designed for edge, on-device, and browser use, run on vLLM, llama.cpp, MLX, watsonx |
Architecture and Training
The H variants interleave SSM layers with transformer layers. This hybrid design reduces memory growth compared with pure attention while retaining the generality of the transformer block. The Nano models do not use a cut-down data pipeline: they are trained with the same Granite 4.0 methodology on over 15T tokens, then instruction-tuned for reliable tool use and instruction following. This carries the advantages of the larger Granite 4.0 models down to sub-2B scale.
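To illustrate the layer pattern rather than the exact model, the toy sketch below interleaves SSM-style mixer blocks with standard attention blocks. The real H variants use Mamba-2 blocks and an interleaving ratio IBM has not restated here; the `SimpleSSMBlock` stand-in and the 3:1 ratio are assumptions for illustration only.

```python
# Illustrative only: interleaving SSM-style blocks with attention blocks.
# Granite 4.0 H uses Mamba-2; SimpleSSMBlock is a stand-in (gated causal
# depthwise conv), not Mamba-2, and the 3:1 ratio is an assumed example.
import torch
import torch.nn as nn

class SimpleSSMBlock(nn.Module):
    """Stand-in for a Mamba-2 block: causal depthwise conv plus gating."""
    def __init__(self, d_model: int, kernel: int = 4):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel, groups=d_model,
                              padding=kernel - 1)  # left-pad, then trim: causal
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        h = self.conv(x.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + h * torch.sigmoid(self.gate(x))  # gated residual mix

class AttentionBlock(nn.Module):
    """Plain self-attention block; real decoders add a causal mask."""
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        h, _ = self.attn(x, x, x, need_weights=False)
        return self.norm(x + h)

class HybridStack(nn.Module):
    """Interleave `ssm_per_attn` SSM-style blocks per attention block."""
    def __init__(self, d_model=512, depth=12, ssm_per_attn=3):
        super().__init__()
        self.layers = nn.ModuleList(
            AttentionBlock(d_model) if (i + 1) % (ssm_per_attn + 1) == 0
            else SimpleSSMBlock(d_model)
            for i in range(depth)
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(1, 16, 512)
print(HybridStack()(x).shape)  # torch.Size([1, 16, 512])
```

The point of the interleaving is that most layers carry constant-size SSM state while the occasional attention layer preserves exact token-to-token lookup, which is where the memory savings over pure attention come from.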
Benchmarks and Competitive Landscape
IBM compared Granite 4.0 Nano with other sub-2B models, including Qwen, Gemma, and LiquidAI's LFM. Across the reported summaries, the models show significant gains in general knowledge, math, coding, and safety at similar parameter budgets. On agentic tasks, they outperform several peers on IFEval and the Berkeley Function Calling Leaderboard v3 (BFCLv3).
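Since the BFCLv3 result points at tool calling, here is a hedged sketch of how a tool-call prompt could be rendered with the standard transformers chat-template API. It assumes the Granite Nano chat template accepts a `tools` list, and the repo id is again an assumption.

```python
# Hedged sketch: render a tool-calling prompt via the chat template.
# Assumes the model's template supports the standard `tools` argument.
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city.
    """
    return "sunny, 22 C"  # stub; a real runtime would execute this

tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-4.0-h-1b")  # assumed id
messages = [{"role": "user", "content": "What is the weather in Zurich?"}]

# Advertise the tool to the model; the instruct model is expected to reply
# with a structured tool call that the caller parses and executes.
prompt = tokenizer.apply_chat_template(
    messages, tools=[get_weather], add_generation_prompt=True, tokenize=False
)
print(prompt)
```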


Key Takeaways
- IBM released 8 Granite 4.0 Nano models in two sizes, 350M and about 1B, spanning hybrid SSM and pure transformer variants, each in base and instruct versions, all under Apache 2.0.
- The hybrid H models (Granite 4.0 H 1B at about 1.5B parameters and Granite 4.0 H 350M at about 350M) reuse the Granite 4.0 training recipe on over 15T tokens, so capabilities are inherited from the larger series rather than from a reduced data branch.
- IBM reports that Granite 4.0 Nano is competitive with other sub-2B models such as Qwen, Gemma, and LiquidAI's LFM on general knowledge, math, code, and safety, and leads several peers on IFEval and BFCLv3, which matter for tool-using agents.
- All Granite 4.0 models, including Nano, are cryptographically signed, ISO 42001 certified, and released for enterprise use, providing provenance and governance that typical small community models do not.
- The models are available on Hugging Face and IBM watsonx.ai, with runtime support for vLLM, llama.cpp, and MLX, making local, edge, and browser-level deployment realistic for AI engineers and software teams; see the serving sketch after this list.
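As a usage note for the runtime support mentioned above, here is a minimal sketch of offline inference with vLLM's Python API. The repo id is an assumption; any of the eight checkpoints should work the same way where vLLM supports the architecture.

```python
# Minimal sketch: offline inference with vLLM (repo id is assumed).
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-4.0-350m")  # assumed repo id
params = SamplingParams(max_tokens=64, temperature=0.2)
outputs = llm.generate(["Explain edge inference in one sentence."], params)
print(outputs[0].outputs[0].text)
```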
IBM did the right thing here: it took the same Granite 4.0 training pipeline, the same 15T-token scale, and the same hybrid Mamba-2 plus transformer architecture, and brought them down to 350M and about 1B, so edge and on-device workloads get the same governance and provenance story the larger Granite models already have. The models are Apache 2.0 licensed, ISO 42001 certified, cryptographically signed, and ready to run on vLLM, llama.cpp, and MLX. Overall, this is a clean, auditable way to run small LLMs.
Check out the model weights on Hugging Face and the technical details. Feel free to visit our GitHub page for tutorials, code, and notebooks. Also follow us on Twitter, join our 100k+ ML SubReddit, subscribe to our newsletter, and join us on Telegram.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for the benefit of society. His most recent endeavor is the launch of Marktechpost, an artificial intelligence media platform that stands out for its in-depth coverage of machine learning and deep learning news that is technically sound and easy to understand for a broad audience. The platform draws more than 2 million monthly views, a testament to its popularity among readers.