basetenlabs/ideogram4
forked from ideogram-oss/ideogram4
Description: Ideogram 4: Open image model at the forefront of design
Language: Python
License: Apache-2.0
Stars: 0
Forks: 0
Open issues: 0
Created: 2026-06-03T20:18:15Z
Pushed: 2026-06-10T03:32:38Z
Default branch: main
Fork: yes
Parent repository: ideogram-oss/ideogram4
Archived: no
README:
Ideogram 4: Open image model at the forefront of design
Ideogram 4 is [Ideogram](https://ideogram.ai)'s first open-weight text-to-image model. It is a state-of-the-art foundation model trained from scratch — not a fine-tune of any existing model. It introduces a new structured JSON prompting interface, with best-in-class multilingual text rendering, deep language understanding, explicit bounding-box layout and color-palette controls, and native 2k resolution images. The easiest way to try the model is online at [ideogram.ai](https://ideogram.ai/).
We believe openness drives innovation, and we invite the research community to innovate with us on the forefront of visual intelligence.
Table of Contents
1. [News](#news)
2. [Model Zoo](#model-zoo)
3. [Performance](#performance)
4. [Quick Start](#quick-start)
5. [Model Summary](#model-summary)
6. [Prompting Guide](#prompting-guide)
7. [Documentation](#documentation)
8. [Citation](#citation)
News
- [2026-06-03] Ideogram 4 released! Inference code and weights are now public, and our technical blog post is live. See the [Quick Start](#quick-start) section to generate your first image, or try the model online at ideogram.ai.
Model Zoo
| Model | Params | Weight Quantization | Supported Hardware | Diffusers Support | License |
| :--- | :---: | :---: | :---: | :---: | :---: |
| [Ideogram 4 (nf4)](https://huggingface.co/ideogram-ai/ideogram-4-nf4) | 9.3B | nf4 | CUDA | Yes | [Ideogram 4 Non-Commercial](model_licenses/LICENSE-IDEOGRAM-4-NON-COMMERCIAL) |
| [Ideogram 4 (fp8)](https://huggingface.co/ideogram-ai/ideogram-4-fp8) | 9.3B | fp8 | All | No | [Ideogram 4 Non-Commercial](model_licenses/LICENSE-IDEOGRAM-4-NON-COMMERCIAL) |
We plan to support more quantizations in the future.
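As a rough guide to what the two quantizations mean for memory, the sketch below estimates the size of the weights alone from the parameter count in the table. It assumes 4 bits per parameter for nf4 and 8 bits for fp8, and it ignores per-block scaling constants, any layers kept at higher precision, and activation memory, so real usage will be higher. The function name `weight_gigabytes` is illustrative and not part of the repository.

```python
# Rough weight-only memory estimate for the two checkpoints in the Model Zoo table.
# Assumption: nf4 stores about 4 bits per parameter and fp8 stores 8 bits.
# Quantization constants, unquantized layers, and activations are not counted.

PARAMS = 9.3e9  # parameter count from the Model Zoo table

BITS_PER_PARAM = {"nf4": 4, "fp8": 8}


def weight_gigabytes(params: float, bits_per_param: int) -> float:
    """Return the approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_PARAM.items():
        print(f"{name}: ~{weight_gigabytes(PARAMS, bits):.2f} GB of weights")
```

Under these assumptions the nf4 checkpoint needs about half the weight memory of the fp8 one.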
Performance
We evaluate Ideogram 4 across third-party arenas and benchmarks, standard open-source benchmarks, and our own internal human-preference benchmark. Across all of them, Ideogram 4 is the best open-weight image model by far, and sits at the frontier of design.
Design Arena
Design Arena is a third-party image Elo leaderboard focused specifically on design-oriented generation. On the overall board, Ideogram 4 is the top-ranked open-weight model, trailing only proprietary GPT and Gemini models:
Filtered to open-weight models only, Ideogram 4 leads by a commanding margin, well ahead of the next-best open model:
ContraLabs
ContraLabs ran a blind typography evaluation judged by ten professional designers from Contra's top-earning talent. Ideogram 4 leads on first-place win rate, picked as the best of four models 47.9% of the time overall — well ahead of Gemini 3.1 Flash Image Preview (Nano Banana 2) at 30.0%, FLUX.2 [max] (15.5%), and Grok Imagine 1.0 (15.0%):
It also wins on practical usability: asked "Would you use this in real client work?", the same designers rated Ideogram 4 highest at 3.55 / 5 — significantly above Nano Banana 2 (2.84), Grok Imagine 1.0 (2.61), and FLUX.2 [max] (2.49):
LMArena
On LMArena, a third-party text-to-image leaderboard that measures general-purpose text-to-image use cases, Ideogram is the top-ranked open-weight lab and a top-5 image generation lab overall — beaten only by giant companies with vastly larger budgets and resources:
Ideogram internal eval
For our internal human-preference benchmark, focused on graphic design and photography, we had graphic designers deeply familiar with professional design work do the rating blind. Bradley-Terry scores rank Ideogram 4 #2 overall — behind only GPT Image 2 medium — and the top open-weight model:
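For readers unfamiliar with Bradley-Terry scoring, the sketch below shows one common way to turn blind pairwise preferences into a ranking, using the standard minorization-maximization update. It is a generic illustration and not Ideogram's evaluation code; the model names and win counts are hypothetical toy data.

```python
from collections import defaultdict


def fit_bradley_terry(wins, iterations=200):
    """Fit Bradley-Terry strengths from pairwise win counts.

    wins[(a, b)] is the number of times a was preferred over b.
    Returns a dict of strengths normalized to sum to 1, where
    P(a preferred over b) = strength[a] / (strength[a] + strength[b]).
    """
    models = sorted({m for pair in wins for m in pair})
    total_wins = defaultdict(float)
    games = defaultdict(float)  # comparisons per unordered pair
    for (winner, loser), count in wins.items():
        total_wins[winner] += count
        games[frozenset((winner, loser))] += count

    strength = {m: 1.0 / len(models) for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            denom = sum(
                games[frozenset((m, other))] / (strength[m] + strength[other])
                for other in models
                if other != m
            )
            updated[m] = total_wins[m] / denom if denom else strength[m]
        norm = sum(updated.values())
        strength = {m: s / norm for m, s in updated.items()}
    return strength


if __name__ == "__main__":
    # Hypothetical counts: ("A", "B"): 7 means A was preferred over B 7 times.
    toy_wins = {
        ("A", "B"): 7, ("B", "A"): 3,
        ("A", "C"): 8, ("C", "A"): 2,
        ("B", "C"): 6, ("C", "B"): 4,
    }
    scores = fit_bradley_terry(toy_wins)
    for model in sorted(scores, key=scores.get, reverse=True):
        print(model, round(scores[model], 3))
```

The update assumes every model wins at least one comparison; a model with zero wins would be driven to a strength of zero.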
Open-source benchmarks
On standard open-source benchmarks measuring core capabilities — layout control (7Bench), spatial reasoning and object fidelity (SpatialGenEval), text rendering (X-Omni OCR), and prompt alignment (Prism) — Ideogram 4 closes the gap to the leading closed-source models across every axis. On layout control (7Bench), it is significantly better than all closed-source models:
At 9.3B parameters, Ideogram 4 delivers the best text rendering of any open-weight release we benchmarked — ahead of much larger models like Qwen-Image (20B), FLUX.2 [dev] (32B), and HunyuanImage 3.0 (80B MoE):
Quick Start
Install
pip install .
If you plan to modify the code, install in editable mode instead so changes under src/ideogram4/ take effect without reinstalling:
pip install -e .
Model access
The model weights are gated on Hugging Face, so you must accept the gate and authenticate before the code can download them — otherwise the download fails with a 404 / GatedRepoError.
1. Open the model page, ideogram-ai/ideogram-4-nf4 (or ideogram-ai/ideogram-4-fp8), and click Agree and access repository to accept the license gate.
2. Create a Hugging Face access token at huggingface.co/settings/tokens and log in so the download is authenticated:
hf auth login
Alternatively, export the token directly: export HF_TOKEN="hf_...".
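To confirm that the gate and the token are set up before running inference, a short standalone check can help. The sketch below is an assumption about how you might verify access yourself, not the repository's own download logic: `resolve_repo_id` and `fetch_weights` are hypothetical helper names, and the only external call is `snapshot_download` from the `huggingface_hub` package.

```python
import os

# Repository IDs as listed in the Model Zoo table.
REPO_IDS = {
    "nf4": "ideogram-ai/ideogram-4-nf4",
    "fp8": "ideogram-ai/ideogram-4-fp8",
}


def resolve_repo_id(quantization: str) -> str:
    """Map a quantization name to its Hugging Face repository ID."""
    try:
        return REPO_IDS[quantization]
    except KeyError:
        raise ValueError(
            f"unknown quantization {quantization!r}; expected one of {sorted(REPO_IDS)}"
        ) from None


def fetch_weights(quantization: str) -> str:
    """Download the gated checkpoint and return the local snapshot path."""
    # Imported lazily so resolve_repo_id works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # An unset or empty HF_TOKEN becomes None, which lets huggingface_hub
    # fall back to the credential saved by `hf auth login`.
    token = os.environ.get("HF_TOKEN") or None
    return snapshot_download(repo_id=resolve_repo_id(quantization), token=token)


if __name__ == "__main__":
    print(fetch_weights("nf4"))
```

If the license gate has not been accepted or no valid token is available, the download call is where the gated-repository error described above would surface.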
CLI
The plain --prompt is rewritten into the structured JSON caption the model expects by a "magic prompt" LLM. By default this uses Ideogram's hosted magic-prompt API, which is free and does the expansion server-side (no local model or system prompt needed). It reads IDEOGRAM_API_KEY — get a key at https://ideogram.ai/api/learn/:
python run_inference.py \
  --prompt "a ginger cat wearing a tiny wizard hat reading a spellbook" \
  --output out.png \
  --quantization "nf4" \
  --magic-prompt-key "$IDEOGRAM_API_KEY"
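To generate several images in one go, the same command can be driven from Python. This is a minimal sketch that uses only the flags shown in the command above; `build_command` and `run_batch` are hypothetical helper names, the output file naming is an arbitrary choice, and the script is assumed to be launched from the repository root so that `run_inference.py` resolves.

```python
import os
import subprocess
import sys


def build_command(prompt, output, quantization, api_key):
    """Assemble the run_inference.py argument list from the documented flags."""
    return [
        sys.executable, "run_inference.py",
        "--prompt", prompt,
        "--output", output,
        "--quantization", quantization,
        "--magic-prompt-key", api_key,
    ]


def run_batch(prompts, quantization="nf4"):
    """Run one inference process per prompt, writing out_000.png, out_001.png, ..."""
    api_key = os.environ.get("IDEOGRAM_API_KEY")
    if not api_key:
        raise RuntimeError("Set IDEOGRAM_API_KEY before running")
    for index, prompt in enumerate(prompts):
        command = build_command(prompt, f"out_{index:03d}.png", quantization, api_key)
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_batch([
        "a ginger cat wearing a tiny wizard hat reading a spellbook",
    ])
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain quotes or spaces.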
You can also run the expansion through your own LLM provider; one of our magic-prompt system prompts is open source. See the [Prompting Guide](docs/prompting.md#magic-prompt) for details.
For the highest-quality images, set --height 2048 --width 2048 and…
Notability: 1.0/10 (routine fork, no traction)