Model · StepFun · published Sep 7, 2025 · seen 5d

stepfun-ai/Step1X-Edit-v1p2-preview

published Sep 7, 2025 · seen 5d · captured 11h · http 200 · method plain · task image-to-image · license apache-2.0 · library diffusers · downloads 11 · likes 17

🔥🔥🔥 News!!

  • Sep 08, 2025: 👋 We release step1x-edit-v1p2-preview, a new version of Step1X-Edit with reasoning-based editing and better performance (report to be released soon), featuring:
  • Native Reasoning Edit Model: Combines instruction reasoning with reflective correction to handle complex edits more accurately. Performance on KRIS-Bench:

| Models | Factual Knowledge ⬆️ | Conceptual Knowledge ⬆️ | Procedural Knowledge ⬆️ | Overall ⬆️ |
|:------------:|:------------:|:------------:|:------------:|:------------:|
| Step1X-Edit v1.1 | 53.05 | 54.34 | 44.66 | 51.59 |
| Step1x-edit-v1p2-preview | 60.49 | 58.81 | 41.77 | 52.51 |
| Step1x-edit-v1p2-preview (thinking) | 62.24 | 62.25 | 44.43 | 55.21 |
| Step1x-edit-v1p2-preview (thinking + reflection) | 62.94 | 61.82 | 44.08 | 55.64 |

  • Improved image editing quality and better instruction-following performance. Performance on GEdit-Bench:

| Models | G_SC ⬆️ | G_PQ ⬆️ | G_O ⬆️ | Q_SC ⬆️ | Q_PQ ⬆️ | Q_O ⬆️ |
|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|:------------:|
| Step1X-Edit (v1.0) | 7.13 | 7.00 | 6.44 | 7.39 | 7.28 | 7.07 |
| Step1X-Edit (v1.1) | 7.66 | 7.35 | 6.97 | 7.65 | 7.41 | 7.35 |
| Step1x-edit-v1p2-preview | 8.14 | 7.55 | 7.42 | 7.90 | 7.34 | 7.40 |
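As a reading aid for the two tables above, the following sketch computes each model's gain over the Step1X-Edit v1.1 baseline on the KRIS-Bench "Overall" column and the GEdit-Bench G_O column. The scores are copied from the tables; the helper name `gains_over_baseline` and the dictionary names are illustrative only and are not part of any released tooling.

```python
from typing import Dict


def gains_over_baseline(scores: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Return each model's score minus the baseline's score, rounded to 2 decimals."""
    base = scores[baseline]
    return {
        name: round(value - base, 2)
        for name, value in scores.items()
        if name != baseline
    }


# "Overall" column of the KRIS-Bench table.
kris_overall = {
    "Step1X-Edit v1.1": 51.59,
    "Step1x-edit-v1p2-preview": 52.51,
    "Step1x-edit-v1p2-preview (thinking)": 55.21,
    "Step1x-edit-v1p2-preview (thinking + reflection)": 55.64,
}

# "G_O" column of the GEdit-Bench table.
gedit_g_o = {
    "Step1X-Edit (v1.0)": 6.44,
    "Step1X-Edit (v1.1)": 6.97,
    "Step1x-edit-v1p2-preview": 7.42,
}

for name, delta in gains_over_baseline(kris_overall, "Step1X-Edit v1.1").items():
    print(f"KRIS-Bench overall, {name}: {delta:+.2f}")

for name, delta in gains_over_baseline(gedit_g_o, "Step1X-Edit (v1.1)").items():
    print(f"GEdit-Bench G_O, {name}: {delta:+.2f}")
```

On the reported numbers, the KRIS-Bench overall gains over v1.1 work out to +0.92 without thinking, +3.62 with thinking, and +4.05 with thinking and reflection.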

🧩 Model Usage

Install the diffusers package with the following commands:

git clone -b dev/MergeV1-2 https://github.com/Peyton-Chen/diffusers.git
cd diffusers
pip install -e .
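
Because the pipeline class comes from a development branch, it may help to confirm that the installed diffusers actually exposes it before loading any weights. The check below is a minimal sketch; the helper name `has_step1x_pipeline` is hypothetical, and a `True` result only means the name resolves, not that every runtime dependency (such as PyTorch or a GPU) is usable.

```python
import importlib


def has_step1x_pipeline(class_name: str = "Step1XEditPipelineV1P2") -> bool:
    """Return True if the installed diffusers package exposes `class_name`."""
    try:
        diffusers = importlib.import_module("diffusers")
        # diffusers resolves top-level names lazily, so attribute access can
        # trigger further imports; any failure is treated as "not available".
        getattr(diffusers, class_name)
    except Exception:
        return False
    return True


if __name__ == "__main__":
    if has_step1x_pipeline():
        print("Step1XEditPipelineV1P2 is available.")
    else:
        print("Step1XEditPipelineV1P2 not found; check that the branch above is installed.")
```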

Here is an example of using the Step1XEditPipelineV1P2 class to edit images with thinking and reflection enabled:

import torch
from diffusers import Step1XEditPipelineV1P2
from diffusers.utils import load_image

# Load the pipeline in bfloat16 and move it to the GPU.
pipe = Step1XEditPipelineV1P2.from_pretrained("stepfun-ai/Step1X-Edit-v1p2-preview", torch_dtype=torch.bfloat16)
pipe.to("cuda")

print("=== processing image ===")
image = load_image("examples/0000.jpg").convert("RGB")
prompt = "add a ruby pendant on the girl's neck."
enable_thinking_mode = True
enable_reflection_mode = True

# Run the edit with a fixed seed for reproducibility.
pipe_output = pipe(
    image=image,
    prompt=prompt,
    num_inference_steps=28,
    true_cfg_scale=4,
    generator=torch.Generator().manual_seed(42),
    enable_thinking_mode=enable_thinking_mode,
    enable_reflection_mode=enable_reflection_mode,
)

# With thinking enabled, the output carries the reformatted prompt.
if enable_thinking_mode:
    print("Reformat Prompt:", pipe_output.reformat_prompt)

# Save every returned image; with reflection enabled, print its think_info entry.
for image_idx in range(len(pipe_output.images)):
    pipe_output.images[image_idx].save(f"0001-{image_idx}.jpg", lossless=True)
    if enable_reflection_mode:
        print(pipe_output.think_info[image_idx])
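
The KRIS-Bench table reports three configurations of the v1p2 preview: plain, thinking, and thinking + reflection. The sketch below shows one way to select among them using only the keyword arguments that appear in the example above. The mode names, the `MODE_FLAGS` table, and the `edit_kwargs` helper are hypothetical, and the assumption that "thinking" alone corresponds to `enable_thinking_mode=True` with `enable_reflection_mode=False` is not confirmed by this card.

```python
from typing import Any, Dict

# Assumed mapping from the benchmark row labels to the two pipeline flags.
MODE_FLAGS: Dict[str, Dict[str, bool]] = {
    "base": {"enable_thinking_mode": False, "enable_reflection_mode": False},
    "thinking": {"enable_thinking_mode": True, "enable_reflection_mode": False},
    "thinking+reflection": {"enable_thinking_mode": True, "enable_reflection_mode": True},
}


def edit_kwargs(mode: str) -> Dict[str, Any]:
    """Build keyword arguments for a pipeline call in the given mode.

    The sampling settings (28 steps, true_cfg_scale=4) are copied from the
    usage example; they are not claimed to be the pipeline's defaults.
    """
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(MODE_FLAGS)}")
    return {"num_inference_steps": 28, "true_cfg_scale": 4, **MODE_FLAGS[mode]}


# Intended use with the objects from the example above (requires the model and a GPU):
#   pipe_output = pipe(image=image, prompt=prompt,
#                      generator=torch.Generator().manual_seed(42),
#                      **edit_kwargs("thinking"))
print(edit_kwargs("thinking+reflection"))
```

Keeping the flags in one table makes it easier to run the same image and prompt through each configuration and compare the outputs side by side.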

The results will look like:

📑 Model introduction

Framework of Step1X-Edit. Step1X-Edit leverages the image understanding capabilities of MLLMs to parse editing instructions and generate editing tokens, which are then decoded into images using a DiT-based network. For more details, please refer to our technical report.

We release GEdit-Bench, a new benchmark grounded in real-world usage. It is carefully curated to reflect actual user editing needs across a wide range of editing scenarios, enabling more authentic and comprehensive evaluation of image editing models. Partial results on the benchmark are shown below:

Citation

@article{liu2025step1x-edit,
  title={Step1X-Edit: A Practical Framework for General Image Editing},
  author={Shiyu Liu and Yucheng Han and Peng Xing and Fukun Yin and Rui Wang and Wei Cheng and Jiaqi Liao and Yingming Wang and Honghao Fu and Chunrui Han and Guopeng Li and Yuang Peng and Quan Sun and Jingwei Wu and Yan Cai and Zheng Ge and Ranchen Ming and Lei Xia and Xianfang Zeng and Yibo Zhu and Binxing Jiao and Xiangyu Zhang and Gang Yu and Daxin Jiang},
  journal={arXiv preprint arXiv:2504.17761},
  year={2025}
}

Notability

notability 1.0/10

Very low traction, preview model