inclusionAI/dInfer v0.2.0
v0.2.0
Repository: inclusionAI/dInfer
Tag: v0.2.0
Published: 2025-12-21T07:11:08Z
Prerelease: no
Release notes: This release delivers multiple major features:
- support block diffusion LLMs (LLaDA2-preview and LLaDA2);
- add optimized batch inference for block diffusion LLMs;
- support long-sequence generation;
- support native CUDA graph capture and management;
- add SGLang as a backend for block diffusion LLM inference;
- support FP8 quantization for LLaDA2-preview and LLaDA2;
- add experimental support for integrating dInfer with SGLang;
- support two lm_eval benchmarks (gsm8k and mbpp) on LLaDA2-preview and LLaDA2.