What does this repo signal mean?

IBM (Granite) published ibm-granite/granite-vision-models (Jupyter Notebook). This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo ibm-granite/granite-vision-models · language Jupyter Notebook · New vision models from IBM, moderate stars.. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

IBM (Granite) Repo: ibm-granite/granite-vision-models

Captured source

source ↗

GitHub/github.com/ibm-granite/granite-vision-models

ibm-granite/granite-vision-models repository metadata

Source ↗

published Feb 17, 2025seen Jun 5captured Jun 11http 200method plain

ibm-granite/granite-vision-models

Language: Jupyter Notebook

License: Apache-2.0

Stars: 46

Forks: 10

Open issues: 5

Created: 2025-02-17T21:42:13Z

Pushed: 2026-04-29T22:17:48Z

Default branch: main

Fork: no

Archived: no

README:

:books: Granite Vision Paper | :bar_chart: ChartNet CVPR 2026 Paper | :hugs: HuggingFace Collection | :speech_balloon: Discussions Page

Granite Vision Models

Granite Vision is a family of multimodal vision‑language models designed to support enterprise‑grade document understanding tasks, including charts, tables, key‑value extraction, and structured image‑to‑text generation. This repository provides documentation, examples, and pointers to available model releases and datasets.

---

🚀 Latest Release: Granite‑Vision-4.1‑4B

Granite‑Vision-4.1‑4B is a vision‑language model tailored for enterprise document data extraction, delivered as a LoRA adapter on top of Granite-4.1-3B.

It supports:

Chart extraction — Chart‑to‑CSV, Chart‑to‑Summary, Chart‑to‑Code
Table extraction — JSON, HTML, and OTSL
Semantic KVP extraction — Schema‑guided extraction across diverse document layouts
Image‑to‑text — Natural‑language descriptions of images

Granite‑Vision-4.1‑4B preserves and extends Granite Vision 4 capabilities while providing more specialized extraction workflows.

---

📊 ChartNet Dataset

ChartNet is a million‑scale multimodal dataset created to support robust chart understanding tasks: ➡️ https://huggingface.co/datasets/ibm-granite/ChartNet

It includes:

1.7M synthetic charts with aligned images, code, tables, summaries, and reasoning
94,643 human‑verified charts
2,000 human‑verified test samples
24 chart types, across 6 plotting libraries

ChartNet uses a code‑guided synthesis pipeline, producing tightly aligned visual, numerical, and textual components. It was used during training for Granite‑Vision-4.1‑4B.

---

📚 Legacy Granite Vision Models

Older Granite Vision models remain available for users who rely on earlier releases:

Granite Vision 4 (3B)

https://huggingface.co/ibm-granite/granite-4.0-3b-vision

Granite Vision 3.3 (2B)

https://huggingface.co/ibm-granite/granite-vision-3.3-2b

Granite Vision 3.1 (2B Preview)

https://huggingface.co/ibm-granite/granite-vision-3.1-2b-preview

Granite Vision 3.3 (GGUF‑converted)

https://huggingface.co/ibm-granite/granite-vision-3.3-2b-GGUF

---

License

All Granite Vision Models are distributed under [Apache 2.0](./LICENSE) license.

---

Would you like to provide feedback?

Please let us know your comments about our family of language models by visiting our Hugging Face model collection: https://huggingface.co/collections/ibm-granite/granite-vision-models-67b3bd4ff90c915ba4cd2800

Select the model repository you would like to provide feedback about, go to the Community tab, and click New discussion.

Alternatively, you may also post questions or comments on our GitHub discussions page: https://github.com/orgs/ibm-granite/discussions

---

Ethical Considerations and Limitations

The use of Large Vision and Language Models involves important risks, including bias, fairness concerns, misinformation, and challenges around autonomous decision‑making. Granite‑vision‑3.2‑2b is no exception.

Although alignment processes incorporate safety considerations, the model may sometimes produce inaccurate, biased, or unsafe responses. Smaller models in particular may exhibit increased susceptibility to hallucination, an active area of ongoing research.

We urge the community to deploy Granite Vision models responsibly, especially for document‑understanding tasks. More general vision tasks may carry higher risks of harmful or biased outputs.

To enhance safety, we recommend using Granite Vision models alongside Granite Guardian, a fine‑tuned model designed to detect and flag risks across dimensions from the IBM AI Risk Atlas.

---

Contributing

Issues and pull requests are welcome. Please open a GitHub issue to report bugs or suggest enhancements.

Excerpt shown — open the source for the full document.

Notability

notability 6.0/10

New vision models from IBM, moderate stars.