Repo: IBM (Granite), published Sep 17, 2025

ibm-granite/granite-4.0-language-models


License: Apache-2.0

Stars: 213

Forks: 24

Open issues: 10

Created: 2025-09-17T17:27:15Z

Pushed: 2026-03-30T21:18:32Z

Default branch: main

Fork: no

Archived: no

README:

🤗 HuggingFace Collection | 💬 Discussions Page | 📘 IBM Granite Docs

---

Overview

Granite 4.0 language models are lightweight, state-of-the-art open foundation models with native support for multilingual use, coding tasks including fill-in-the-middle (FIM) code completion, retrieval-augmented generation (RAG), tool usage, and structured JSON output.

Our models are developed using a combination of advanced techniques such as structured chat formatting, supervised fine-tuning, reinforcement learning–based model alignment, and model merging. Granite 4.0 features significantly improved *instruction-following* and *tool-calling* capabilities, making it highly effective for enterprise applications and an ideal choice for deployment in environments with constrained compute resources.

All models are publicly released under the Apache 2.0 license, allowing free use for both research and commercial purposes. The data curation and training processes were specifically designed for enterprise scenarios and customization, incorporating governance, risk, and compliance (GRC) evaluations alongside IBM’s standard data clearance and document quality review procedures.

The initial release of the Granite 4.0 models included three sizes—micro, tiny, and small—built on dense, dense-hybrid, and mixture-of-experts (MoE) hybrid architectures. Additional model sizes have been added gradually. We provide both base models (checkpoints after pretraining) and instruct models (checkpoints fine-tuned for dialogue, instruction following, helpfulness, and safety).

Core evaluation results for all model variants are provided on their respective model cards, and a more comprehensive extended evaluation is available [here](URL).

How to Use our Models?

To use any of our models, pick an appropriate model_path from:

1. ibm-granite/granite-4.0-micro-base
2. ibm-granite/granite-4.0-micro
3. ibm-granite/granite-4.0-h-micro-base
4. ibm-granite/granite-4.0-h-micro
5. ibm-granite/granite-4.0-h-tiny-base
6. ibm-granite/granite-4.0-h-tiny
7. ibm-granite/granite-4.0-h-small-base
8. ibm-granite/granite-4.0-h-small
9. ibm-granite/granite-4.0-8b-base
10. ibm-granite/granite-4.0-8b
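The identifiers in this list follow a regular pattern: an optional `h-` prefix (which appears to mark the hybrid variants), a size name, and a `-base` suffix for the pretrained checkpoints. The sketch below builds an ID from those parts and rejects combinations that are not in the list. `KNOWN_MODELS` and `granite_model_path` are hypothetical names used for illustration only; they are not part of any IBM or Hugging Face library.

```python
# Hypothetical helper, not an official API: the set mirrors the ten IDs listed above.
KNOWN_MODELS = {
    "ibm-granite/granite-4.0-micro-base",
    "ibm-granite/granite-4.0-micro",
    "ibm-granite/granite-4.0-h-micro-base",
    "ibm-granite/granite-4.0-h-micro",
    "ibm-granite/granite-4.0-h-tiny-base",
    "ibm-granite/granite-4.0-h-tiny",
    "ibm-granite/granite-4.0-h-small-base",
    "ibm-granite/granite-4.0-h-small",
    "ibm-granite/granite-4.0-8b-base",
    "ibm-granite/granite-4.0-8b",
}


def granite_model_path(size: str, hybrid: bool = False, base: bool = False) -> str:
    """Build a model ID from its parts and check it against the known list."""
    prefix = "h-" if hybrid else ""
    suffix = "-base" if base else ""
    path = f"ibm-granite/granite-4.0-{prefix}{size}{suffix}"
    if path not in KNOWN_MODELS:
        raise ValueError(f"not in the model list: {path}")
    return path


# Instruct checkpoint of the hybrid small model
print(granite_model_path("small", hybrid=True))
```

Validating against a fixed set keeps a mistyped size or an unsupported combination (for example a non-hybrid `tiny`) from reaching `from_pretrained`, where it would surface as a download error instead.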

Inference Examples

Basic Inference

This is a simple example of how to use the Granite-4.0-H-Small model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "auto"
model_path = "ibm-granite/granite-4.0-h-small"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
    {"role": "user", "content": "What is the name of the durable rock known for being one of the hardest natural building stones?"},
]

# render the conversation with the model's chat template
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# tokenize the text and move it to the device that holds the model
# ("auto" is a device_map setting, not a device that tensors can be moved to)
input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=150)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# print output
print(output)
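The example above decodes the full sequence, so the printed text contains the prompt and the chat-template tokens as well as the answer. The following is a minimal sketch of keeping only the newly generated tokens, assuming the usual decoder-only behavior in which `generate` returns the prompt ids followed by the completion. `completion_ids` is a hypothetical helper name, and the ids below are stand-in data so the sketch runs without loading a model.

```python
def completion_ids(output_ids, prompt_len):
    """Drop the first prompt_len ids from every sequence in the batch."""
    return [row[prompt_len:] for row in output_ids]


# Stand-in for one generated sequence: three prompt ids, then three new ids.
fake_output = [[101, 102, 103, 7, 8, 9]]
print(completion_ids(fake_output, prompt_len=3))  # prints [[7, 8, 9]]
```

With a real model, the prompt length would come from `input_tokens["input_ids"].shape[1]`, the helper would be applied to the tensor returned by `model.generate`, and the result could then be decoded with `tokenizer.batch_decode(..., skip_special_tokens=True)` to drop the remaining special tokens.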

Tool-calling capabilities for AI agents

Agentic tool-calling is shaping the future of AI agents, enabling seamless integration of powerful back-end systems into agent-driven workflows. These trajectories often involve multiple tool calls, handling execution responses, and multi-turn user interactions. While agent frameworks orchestrate long-horizon tasks, LLMs must provide the foundation — including standard tool formats, robust tool-call handling (even in edge cases), and support for feeding back execution results.
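Feeding back execution results amounts to running the requested function and appending its output to the conversation as a `tool` message. The sketch below shows that step in isolation, assuming the tool call has already been parsed into the `{"function": {"name": ..., "arguments": ...}}` shape used in this README's chat example. `TOOL_REGISTRY`, `run_tool_call`, and the simulated `check_valid_vin` body are hypothetical and stand in for what an agent framework or a real back end would provide.

```python
import json


def check_valid_vin(vin: str) -> dict:
    """Simulated tool: only checks the VIN length, no real lookup."""
    return {"valid": len(vin) == 17, "vin": vin}


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"check_valid_vin": check_valid_vin}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one parsed tool call and wrap the result as a 'tool' message."""
    function = tool_call["function"]
    name = function["name"]
    arguments = function["arguments"]
    # Some callers serialize arguments as a JSON string; accept both forms.
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        result = {"error": f"unknown tool: {name}"}
    else:
        result = handler(**arguments)
    return {"role": "tool", "content": json.dumps(result)}


call = {"function": {"name": "check_valid_vin",
                     "arguments": {"vin": "1FMXK92W8YPA12345"}}}
print(run_tool_call(call))
```

Returning an error payload for an unknown tool name, instead of raising, lets the model see the failure on the next turn and respond to it.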

The following code example demonstrates how Granite 4.0’s tool-calling capabilities address these needs. In the first user query, the model successfully generates the appropriate tool call because it has access to the necessary tools. In contrast, it produces an apology message for the second query, as the required tooling is unavailable. Since this example does not use an agent framework, tool execution is simulated.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_path = "ibm-granite/granite-4.0-micro"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

chat = [
    {"role": "user", "content": "I'm looking to buy a used truck for my construction work, but I want to make sure it's legitimate. The seller provided the VIN: 1FMXK92W8YPA12345 and said it's registered in Georgia. Can you verify if the VIN is valid and matches a registered vehicle?"},
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "function": {
                    "name": "check_valid_vin",
                    "arguments": {"vin": "1FMXK92W8YPA12345"}
                }
            }
        ]
    },
    {"role": "tool", "content": "{\"valid\": true, \"vin_details\": {\"make\": \"Ford\", \"model\": \"F-150\", \"year\": 2020, \"vehicle_type\": \"Truck\", \"registration_status\": \"Active\", \"registration_state\": \"GA\", \"odometer\": 82345, \"title_status\": \"Clear\", \"lienholder\": null, \"recall_history\": \"No active recalls\"}, \"notes\": \"VIN is valid and registered in Georgia. PPSR lien check complete - no security interests found. License plate verification requires separate DMV lookup which is not currently available through this tool.\"}"},
    {"role": "user", "content": "I'm also considering purchasing a new Ford F-150 from an official dealership in Texas. Could you provide a cost estimate for this type of truck in that state?"},
]

tools = [
    {
        "type": "function",
        "function": {
            "name": "check_valid_registration",
            "description": "Verifies whether a vehicle registration number is valid for a specific state and returns detailed information about the registered vehicle if valid. Use this function to validate vehicle registration status and obtain ownership/vehicle data.",
            "parameters": {
                "type": "object",
                "properties": {
                    "reg": {
                        "type": "string",
                        "description": "Vehicle registration number in standard format (e.g., ABC123 or XYZ-7890)"
                    },
                    "state": {
                        "type": "string",
                        "description":…

Excerpt shown — open the source for the full document.

Notability: 7.0/10

Major model release from IBM, moderate early traction.