ByteDance-Seed/SimFlow
Description: Official implementation of SimFlow
Language: Python
License: Apache-2.0
Stars: 32
Forks: 1
Open issues: 1
Created: 2025-12-15T02:39:58Z
Pushed: 2025-12-16T02:59:17Z
Default branch: main
Fork: no
Archived: no
README: SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows
Qinyu Zhao1,2   ·   Guangting Zheng2   ·   Tao Yang2   ·   Rui Zhu2†  ·   Xingjian Leng1  ·   Stephen Gould1  ·   Liang Zheng1 
1 Australian National University   2 ByteDance Seed  
† Project Lead  
🌐 Project Page   📃 Paper  
Overview
We train a normalizing flow (NF) and a VAE end-to-end from scratch. No stop-gradient operator is needed, which significantly simplifies prior frameworks. In the overview figure, gray modules marked with a snowflake icon are frozen during training, while colored modules are trained. Solid arrows indicate the forward pass, and dashed arrows denote gradient flow.
On ImageNet 256x256, our end-to-end training framework achieves significantly better generation quality than the previous state-of-the-art NF model (STARFlow) with far fewer training epochs.
News and Updates
[2025-12-15] Initial Release with Codebase.
Getting Started
Environment Setup
To set up our environment, please run:
git clone https://github.com/ByteDance-Seed/SimFlow.git
cd SimFlow
# If you use conda, please uncomment the following lines.
# conda create -n simflow python=3.11.2 -y
# conda activate simflow
pip install -r requirements.txt
Train SimFlow
Please download and extract the training split of the ImageNet-1K dataset.
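Before launching a long run, it may be worth confirming that the extracted data looks complete. The sketch below assumes a torchvision `ImageFolder`-style layout (one subdirectory per class, each holding that class's image files). The README does not state the layout that `train_vae_w_nf.py` expects, so the layout, the `./imagenet` location, and the helper name `summarize_image_folder` are all assumptions made for illustration. For the full ImageNet-1K training split, one would expect 1,000 classes and 1,281,167 images.

```python
from pathlib import Path

# Assumption: ImageFolder-style layout, <root>/<class_id>/<image file>.
# The layout actually expected by train_vae_w_nf.py may differ.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def summarize_image_folder(root):
    """Return (number of class directories, number of image files) under root."""
    root = Path(root)
    class_dirs = [p for p in root.iterdir() if p.is_dir()]
    num_images = sum(
        1
        for class_dir in class_dirs
        for f in class_dir.iterdir()
        if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
    )
    return len(class_dirs), num_images


if __name__ == "__main__":
    # Hypothetical location; point this at wherever the class folders live.
    data_root = Path("./imagenet")
    if data_root.is_dir():
        num_classes, num_images = summarize_image_folder(data_root)
        print(f"{num_classes} classes, {num_images} images")
    else:
        print(f"{data_root} not found; adjust the path before running")
```

If the counts come out lower than expected, the archive was probably only partially extracted, or the class folders sit one level deeper (for example under a `train/` subdirectory).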
A sample command for training SimFlow with REPA-E is shown below.
torchrun --nproc_per_node=8 --nnodes=2 --node_rank=${NODE_RANK} --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} \
train_vae_w_nf.py \
--seed=0 \
--data_path="./imagenet" \
--output_dir="./output/vae_f16d64_std0_5_simflow_adaln_2222246_repaAlign3/" \
--resume="./output/vae_f16d64_std0_5_simflow_adaln_2222246_repaAlign3/" \
--batch_size=16 \
--checkpointing-steps=100000 --sampling-steps 5000 \
--loss-cfg-path="configs/vae_loss/l1_lpips_kl_gan_joint_training.yaml" \
--vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
--channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
--lr_schedule='const_then_cosine' --warmup_epochs=0 --hold_epochs=80 --min_lr=1e-6 --epochs=160 --max-train-steps=800000 \
--enc_type="dinov2-vit-b" --repa_align_depth='-1,-1,1,-1,-1,-1' \
--disturb_latents='none' \
--online_eval --eval_steps=100000 --cfg=0.0
Evaluation
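Before evaluating a checkpoint, it can help to sanity-check the training budget implied by the sample training command (`--nproc_per_node=8 --nnodes=2`, `--batch_size=16`, `--epochs=160`, `--hold_epochs=80`, `--min_lr=1e-6`). The sketch below rests on several assumptions that the README does not confirm: that `--batch_size` is per GPU, that the last partial batch of each epoch is dropped, and that `const_then_cosine` means "hold the base learning rate, then cosine-decay to `min_lr`". The base learning rate of `1e-4` is a placeholder, since the command does not set one, and all function names are hypothetical.

```python
import math

IMAGENET1K_TRAIN_IMAGES = 1_281_167  # size of the ImageNet-1K training split


def global_batch_size(gpus_per_node, nodes, per_gpu_batch):
    """Global batch size, assuming --batch_size is a per-GPU value."""
    return gpus_per_node * nodes * per_gpu_batch


def steps_per_epoch(num_images, batch):
    """Optimizer steps per epoch, assuming the last partial batch is dropped."""
    return num_images // batch


def const_then_cosine(epoch, base_lr, min_lr, hold_epochs, total_epochs):
    """Guess at the schedule: constant for hold_epochs, then cosine decay."""
    if epoch < hold_epochs:
        return base_lr
    progress = min(1.0, (epoch - hold_epochs) / (total_epochs - hold_epochs))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


batch = global_batch_size(gpus_per_node=8, nodes=2, per_gpu_batch=16)
per_epoch = steps_per_epoch(IMAGENET1K_TRAIN_IMAGES, batch)
print(batch)            # 256
print(per_epoch)        # 5004
print(per_epoch * 160)  # 800640

# Placeholder base learning rate (not taken from the README).
for epoch in (0, 80, 120, 160):
    lr = const_then_cosine(epoch, base_lr=1e-4, min_lr=1e-6,
                           hold_epochs=80, total_epochs=160)
    print(epoch, f"{lr:.2e}")
```

Under these assumptions, 160 epochs come to roughly 800,000 steps, which lines up with `--max-train-steps=800000` and suggests the per-GPU reading of `--batch_size` is plausible. The training script remains the authority on both the batch size semantics and the exact schedule.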
ImageNet 256x256 | FID = 2.15
torchrun --nproc_per_node=8 eval_vae_w_nf.py \
--seed=0 --output_dir="output/simflow_imagenet256x256" \
--resume="output/simflow_imagenet256x256" \
--vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
--channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
--evaluate --cfg=1.1 --temperature=0.95 --denoising_lr=0.25
ImageNet 256x256 with REPA-E | FID = 1.91
torchrun --nproc_per_node=8 eval_vae_w_nf.py \
--seed=0 --output_dir="output/simflow_imagenet256x256_repae" \
--resume="output/simflow_imagenet256x256_repae" \
--vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
--channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
--evaluate --cfg=1.1 --temperature=0.975 --denoising_lr=0.25
ImageNet 512x512 with REPA-E | FID = 2.74
torchrun --nproc_per_node=8 eval_vae_w_nf.py \
--seed=0 --output_dir="output/simflow_imagenet512x512_repae" \
--resume="output/simflow_imagenet512x512_repae" \
--resolution=512 \
--vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
--channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
--evaluate --cfg=1.0 --temperature=1.0 --eval_bsz=64
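The evaluation commands above share the same architecture flags (`--vae="vae_f16d64"`, `--blocks=6`, `--layers_per_block=2,2,2,2,2,46`), and a mismatch between them and the checkpoint is an easy mistake to make. The sketch below is a small consistency check under one loudly stated assumption: that the name `vae_f16d64` encodes a 16x spatial downsampling factor and 64 latent channels. The README does not define this naming, and the helper names here are hypothetical.

```python
def parse_int_list(text):
    """Parse a comma-separated flag value such as '2,2,2,2,2,46'."""
    return [int(item) for item in text.split(",")]


def latent_shape(resolution, downsample=16, channels=64):
    """Latent tensor shape (C, H, W) for a square image.

    Assumption: 'vae_f16d64' means downsample=16 and channels=64.
    """
    if resolution % downsample != 0:
        raise ValueError("resolution must be divisible by the downsampling factor")
    side = resolution // downsample
    return (channels, side, side)


layers_per_block = parse_int_list("2,2,2,2,2,46")
num_blocks = 6

# The list should have exactly one entry per flow block.
assert len(layers_per_block) == num_blocks

print(sum(layers_per_block))  # 56
print(latent_shape(256))      # (64, 16, 16)
print(latent_shape(512))      # (64, 32, 32)
```

If the naming assumption holds, moving from 256x256 to 512x512 quadruples the number of latent positions, which may be why the 512x512 command sets `--eval_bsz=64` explicitly.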
Pretrained Models
We also provide pretrained models, which can be downloaded from Hugging Face.
Acknowledgement
This codebase builds upon several excellent open-source projects, including:
We sincerely thank the authors for making their work and models publicly available.
Citation
If you find our work useful, please consider citing:
@article{zhao2025simflow,
title={SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows},
author={Zhao, Qinyu and Zheng, Guangting and Yang, Tao and Zhu, Rui and Leng, Xingjian and Gould, Stephen and Zheng, Liang},
journal={arXiv preprint arXiv:2512.04084},
year={2025}
}