What does this model signal mean?

NVIDIA published nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers on Hugging Face. This model signal records what NVIDIA shipped and how the release is positioned. High-signal details: license "other" · 159 Hugging Face downloads · low-traction model release by a major lab. onlylabs links this event to 1 captured evidence page and 6 related model signals.

NVIDIA Model: nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers

Captured source

Hugging Face: huggingface.co/nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers

nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers model card

published May 13, 2026 · seen 5d · captured 15h · http 200 · method plain · task text-to-video · license other · library diffusers · downloads 159 · likes 10

AnyFlow

🖥️ GitHub ｜ 🤗 Hugging Face ｜ 📑 Paper ｜ 🌐 Website

-----

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

In this repository, we present AnyFlow, the first any-step video diffusion framework built on flow maps. AnyFlow offers these key features:

⚡ Any-Step Generation: Unlike traditional distilled models tied to fixed step budgets, AnyFlow enables a single model to adapt to arbitrary inference budgets. It achieves high-quality few-step generation while providing stable improvements as more sampling steps are added.

🔀 Multiple Architectures: AnyFlow supports any-step distillation for both causal and bidirectional video diffusion models.

🎬 Multiple Tasks: AnyFlow supports Text-to-Video, Image-to-Video, and Video-to-Video generation within one causal video diffusion model.

📈 Scalable Performance: AnyFlow is validated from 1.3B up to 14B parameters.
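
The any-step property listed above can be illustrated with a toy example. The sketch below is not AnyFlow's learned network: it uses a scalar ODE (dx/dt = rate * x) whose flow map is known in closed form, and the names `toy_flow_map` and `sample` are hypothetical. It only shows the general idea that a flow map jumps directly from time t to time s, so the same function can be used with any number of steps.

```python
import math


def toy_flow_map(x, t, s, rate=1.0):
    """Exact flow map of the toy ODE dx/dt = rate * x: carries x from time t to time s."""
    return x * math.exp(rate * (s - t))


def sample(x_start, num_steps, t_start=1.0, t_end=0.0):
    """Move from t_start to t_end in num_steps equal jumps of the flow map."""
    times = [t_start + (t_end - t_start) * i / num_steps for i in range(num_steps + 1)]
    x = x_start
    for t, s in zip(times[:-1], times[1:]):
        x = toy_flow_map(x, t, s)
    return x


# With an exact flow map, the step budget does not change the endpoint.
one_step = sample(2.0, num_steps=1)
eight_steps = sample(2.0, num_steps=8)
```

In this toy case one step and eight steps agree up to floating-point error. A learned flow map is only approximate, which is consistent with the card's statement that quality improves as more sampling steps are added.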

This directory contains AnyFlow-FAR-Wan2.1-14B-Diffusers (a 14B causal video diffusion model) in Hugging Face Diffusers format, derived from the **Wan2.1-T2V-14B-Diffusers** text-to-video backbone.

🔥 Latest News!!

May 4, 2026: 👋 We've released the codebase and weights of AnyFlow.

Quickstart

Setup Environment

1️⃣ Create Conda Environment

conda create -n far python=3.10
conda activate far

2️⃣ Install PyTorch and Dependencies

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt --no-build-isolation
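
After installation, a quick sanity check can confirm that the environment matches the steps above. This is a minimal sketch, not part of the AnyFlow repository: the function name `check_environment` and the list of modules to probe are assumptions, and the check only looks for importable packages without loading them.

```python
import importlib.util
import sys


def check_environment(required_modules=("torch", "diffusers", "huggingface_hub")):
    """Report the Python version and which of the given modules cannot be found."""
    major, minor = sys.version_info[:2]
    return {
        "python": f"{major}.{minor}",
        # The setup steps create the conda environment with python=3.10.
        "python_matches_card": (major, minor) == (3, 10),
        # find_spec locates a top-level module without importing it.
        "missing": [m for m in required_modules if importlib.util.find_spec(m) is None],
    }


report = check_environment()
print(report)
```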

Model Download

| Model | Tasks | Resolution | Download Link |
| ----- | ----- | ---------- | ------------- |
| AnyFlow-FAR-Wan2.1-1.3B-Diffusers | T2V, I2V, V2V | 480P | 🤗 Hugging Face |
| AnyFlow-FAR-Wan2.1-14B-Diffusers | T2V, I2V, V2V | 480P | 🤗 Hugging Face |
| AnyFlow-Wan2.1-T2V-14B-Diffusers | T2V | 480P | 🤗 Hugging Face |
| AnyFlow-Wan2.1-T2V-1.3B-Diffusers | T2V | 480P | 🤗 Hugging Face |

Download models using 🤗 hf download:

pip install "huggingface_hub[cli]"

hf download nvidia/AnyFlow-FAR-Wan2.1-1.3B-Diffusers --repo-type model --local-dir experiments/pretrained_models/AnyFlow-FAR-Wan2.1-1.3B-Diffusers
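
The same download can likely be scripted from Python with `huggingface_hub.snapshot_download`. The sketch below mirrors the directory layout of the command above; the helper names `local_dir_for` and `download_model` are hypothetical, and nothing is downloaded until `download_model` is called.

```python
from pathlib import Path


def local_dir_for(repo_id, root="experiments/pretrained_models"):
    """Mirror the CLI example: <root>/<repository name without the organization>."""
    return Path(root) / repo_id.split("/")[-1]


def download_model(repo_id, root="experiments/pretrained_models"):
    """Download a full model repository snapshot into local_dir_for(repo_id)."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(repo_id, root)
    snapshot_download(repo_id=repo_id, repo_type="model", local_dir=str(target))
    return target


target_dir = local_dir_for("nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers")
```

Calling `download_model("nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers")` would then fetch the 14B checkpoint described on this page, network access and disk space permitting.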

Run Text-to-Video Generation with Diffusers

import torch
from diffusers.utils import export_to_video

from far.pipelines.pipeline_far_wan_anyflow import FARWanAnyFlowPipeline

# Load the pipeline and move it to the GPU in bfloat16
model_id = "nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers"
pipeline = FARWanAnyFlowPipeline.from_pretrained(model_id).to('cuda', dtype=torch.bfloat16)

prompt = "CG game concept digital art, a majestic elephant with a vibrant tusk and sleek fur running swiftly towards a herd of its kind."

# Generate 81 frames at 480x832 with a 4-step sampling budget and a fixed seed
video = pipeline(
    prompt=prompt,
    height=480,
    width=832,
    num_frames=81,
    num_inference_steps=4,
    generator=torch.Generator('cuda').manual_seed(0)
).frames[0]
export_to_video(video, "output.mp4", fps=16)
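
The example's settings (81 frames, 480x832, 16 fps) imply a clip of about five seconds. The sketch below works out the duration and an estimated latent size. The compression factors (4x temporal, 8x spatial) are an assumption about the Wan2.1 VAE that is not stated in the model card, and `clip_summary` is a hypothetical helper.

```python
def clip_summary(num_frames=81, height=480, width=832, fps=16,
                 temporal_factor=4, spatial_factor=8):
    """Return the clip duration and the assumed latent grid for the given settings."""
    # Assumption: frame counts of the form temporal_factor * k + 1 map cleanly to latents.
    if (num_frames - 1) % temporal_factor != 0:
        raise ValueError(f"num_frames should be {temporal_factor} * k + 1, got {num_frames}")
    return {
        "duration_s": num_frames / fps,
        "latent_frames": (num_frames - 1) // temporal_factor + 1,
        "latent_height": height // spatial_factor,
        "latent_width": width // spatial_factor,
    }


summary = clip_summary()
print(summary)
```

Under these assumptions the example produces a 5.0625-second clip from a 21 x 60 x 104 latent grid.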

Run Image-to-Video Generation with Diffusers

import torch
from diffusers.utils import export_to_video
from PIL import Image
from torchvision import transforms

from far.pipelines.pipeline_far_wan_anyflow import FARWanAnyFlowPipeline

# Load the pipeline and move it to the GPU in bfloat16
model_id = "nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers"
pipeline = FARWanAnyFlowPipeline.from_pretrained(model_id).to('cuda', dtype=torch.bfloat16)

# load image
image_path = 'assets/example_image.jpg'
prompt = 'A towering, battle-scarred humanoid robot walking through the skeletal remains of a city ruin.'

image = Image.open(image_path).convert('RGB')
# Resize to 480x832, convert to a [0, 1] tensor, then add two leading dims: (1, 1, 3, 480, 832)
image = transforms.ToTensor()(transforms.Resize([480, 832])(image)).unsqueeze(0).unsqueeze(0)

# The conditioning image is passed as the context sequence
video = pipeline(
    prompt=prompt,
    context_sequence={'raw': image},
    height=480,
    width=832,
    num_frames=81,
    num_inference_steps=4,
    generator=torch.Generator('cuda').manual_seed(0)
).frames[0]
export_to_video(video, "output.mp4", fps=16)
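
Note that `transforms.Resize([480, 832])` resizes to exactly that size, so an input image with a different aspect ratio is stretched. One optional workaround, which is not part of the model card, is to center-crop to the target aspect ratio first. The helper name `center_crop_box` is hypothetical; it returns a `(left, top, right, bottom)` box in the form that `PIL.Image.crop` accepts.

```python
def center_crop_box(width, height, target_w=832, target_h=480):
    """Return the largest centered (left, top, right, bottom) box with the target aspect ratio."""
    if width * target_h > height * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        new_w = round(height * target_w / target_h)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is taller than (or equal to) the target: keep full width, trim top and bottom.
    new_h = round(width * target_h / target_w)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


box = center_crop_box(1920, 1080)
print(box)
```

With this in place, `image.crop(center_crop_box(*image.size))` before the resize step would avoid stretching the conditioning image.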

Run Video-to-Video Generation with Diffusers

import torch
from diffusers.utils import export_to_video
import decord
from torchvision import transforms

from far.pipelines.pipeline_far_wan_anyflow import FARWanAnyFlowPipeline

# Make decord return torch tensors
decord.bridge.set_bridge('torch')

# Load the pipeline and move it to the GPU in bfloat16
model_id = "nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers"
pipeline = FARWanAnyFlowPipeline.from_pretrained(model_id).to('cuda', dtype=torch.bfloat16)

# load video
video_path = 'assets/example_video.mp4'
prompt = "A focused trail runner's powerful strides through a dense, sun-dappled forest."

# NOTE: select_frame_indices and num_cond_frames are not defined in this excerpt;
# both must be provided (by the AnyFlow codebase or by the caller) before this runs.
video_reader = decord.VideoReader(video_path)
frame_idxs = select_frame_indices(len(video_reader), video_reader.get_avg_fps(), target_fps=16)[:num_cond_frames]
frames = video_reader.get_batch(frame_idxs)
# Scale to [0, 1] and reorder (N, H, W, C) -> (N, C, H, W)
frames = (frames / 255.0).float().permute(0, 3, 1, 2).contiguous()
# Resize to 480x832 and add a leading batch dimension
frames = transforms.Resize([480, 832])(frames).unsqueeze(0)

# The conditioning frames are passed as the context sequence
video = pipeline(
    prompt=prompt,
    context_sequence={'raw': frames},
    height=480,
    width=832,
    num_frames=81,
    num_inference_steps=4,
    generator=torch.Generator('cuda').manual_seed(0)
).frames[0]
export_to_video(video, "output.mp4", fps=16)
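
The video-to-video example calls `select_frame_indices` without showing its definition. The sketch below is one plausible stand-in, written purely as an assumption about what such a helper does (resampling source frames to roughly 16 fps); it is not the AnyFlow repository's implementation, and the `num_cond_frames` value is likewise only illustrative.

```python
def select_frame_indices(num_frames, source_fps, target_fps=16):
    """Pick source-frame indices that approximate playback at target_fps (hypothetical helper)."""
    if num_frames <= 0 or source_fps <= 0 or target_fps <= 0:
        return []
    step = source_fps / target_fps  # source frames advanced per output frame
    indices = []
    k = 0
    while True:
        idx = int(k * step)  # floor to the nearest earlier source frame
        if idx >= num_frames:
            break
        indices.append(idx)
        k += 1
    return indices


# Illustrative value only; the model card does not state how many frames to condition on.
num_cond_frames = 4
frame_idxs = select_frame_indices(60, 30.0, target_fps=16)[:num_cond_frames]
print(frame_idxs)
```

When the source frame rate already matches the target, this returns every frame index in order. When the source is slower than the target, indices repeat, which duplicates frames rather than interpolating them.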

License

This model is released under the NVIDIA One-Way Noncommercial License ([NSCLv1](LICENSE.md)).

Under the NVIDIA One-Way Noncommercial License (NSCLv1), NVIDIA confirms:

Models are not for commercial use.
NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.

Citation

If you find our work helpful, please cite us.

@article{gu2026anyflow,
title={AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation},
author={Gu, Yuchao and Fang, Guian and Jiang, Yuxin and Mao, Weijia and Han, Song and Cai, Han and Shou, Mike Zheng},
journal={arXiv preprint arXiv:2605.13724},
year={2026}
}

@article{gu2025long,
title={Long-Context Autoregressive Video Modeling with Next-Frame Prediction},
author={Gu, Yuchao and Mao, Weijia and Shou, Mike Zheng},
journal={arXiv preprint arXiv:2503.19325},
year={2025}
}

Acknowledgements

This codebase is built on Diffusers. We also refer to implementations from FAR, Self-Forcing, and…

Excerpt shown — open the source for the full document.

Notability

notability 4.0/10

Low traction model release by major lab