What does this release signal mean?

Arcee AI published v1.1.0-rc0 of arcee-ai/NeMo-RL on 2025-10-15T03:23:51Z, tagged as a prerelease. This release signal is evidence of what shipped, changed, or was packaged for users. High-signal details: a minor release candidate of an existing project, with limited excitement; the release notes cover changes since v1.0.1 and open with breaking changes ("Megatron removed...."). onlylabs links this event to 1 captured evidence page and 6 related release signals.

Arcee AI Release: arcee-ai/NeMo-RL v1.1.0-rc0

Captured source

GitHub/github.com/arcee-ai/NeMo-RL

published Oct 15, 2025 · seen Jun 5 · captured Jun 11 · HTTP 200 · method plain

v1.1.0-rc0

Repository: arcee-ai/NeMo-RL

Tag: v1.1.0-rc0

Published: 2025-10-15T03:23:51Z

Prerelease: yes

Release notes: Since v1.0.1:

Breaking Changes:

Megatron removed - no need for the megatron_cfg block anymore.
Legacy environments removed
Legacy eval harness removed
Dataset options other than dataset.shuffle have been removed.
"Max rollout turns" config option has been removed - implement this in your verifiers environments.
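The breaking changes above suggest that configs written for v1.0.1 may carry options that no longer apply. A minimal sketch of a pre-upgrade check, assuming the config has already been loaded into a nested Python dict: only megatron_cfg, dataset and dataset.shuffle are names taken from the release notes, while the helper find_removed_options, the policy nesting and the enabled and seed keys are hypothetical. The notes do not give the key name for the "Max rollout turns" option, so it is not checked here.

```python
from typing import Any


def find_removed_options(config: dict[str, Any]) -> list[str]:
    """Return dotted paths of options the v1.1.0-rc0 notes list as removed."""
    found: list[str] = []

    def walk(node: Any, path: str) -> None:
        # Search every nesting level for a megatron_cfg block, since the
        # notes do not say where in the config that block lived.
        if not isinstance(node, dict):
            return
        for key, value in node.items():
            dotted = f"{path}.{key}" if path else key
            if key == "megatron_cfg":
                found.append(dotted)
                continue
            walk(value, dotted)

    walk(config, "")

    # Per the notes, dataset.shuffle is the only dataset option that remains.
    dataset = config.get("dataset")
    if isinstance(dataset, dict):
        for key in dataset:
            if key != "shuffle":
                found.append(f"dataset.{key}")
    return found


# Hypothetical v1.0.1-style config, used only to exercise the check.
old_config = {
    "policy": {"megatron_cfg": {"enabled": False}},
    "dataset": {"shuffle": True, "seed": 42},
}
print(find_removed_options(old_config))
# prints ['policy.megatron_cfg', 'dataset.seed']
```

An empty result means none of the removed options named in the notes were found; it does not guarantee that the config is otherwise valid for this release.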

Changelog:

Added grpo.interleave_rolluts. Set it to true to run one step off-policy (consider enabling importance sampling to compensate) and generate the next step's rollouts while you train on the current step's data.
Added checkpointing.hf_checkpoint. Set it to true to checkpoint directly to HF (slower than DCP).
Added new training path: examples/run_sft.py. See examples/configs/sft/afm_pocket_sft.yaml for full configuration.
Added support for Muon via dion. To use it, specify dion.MuonReference as your optimizer, and specify policy.optimizer.scalar_optim as adamw for non-applicable parameters.
Renamed the project to "RLKit".
Removed DPO training path.
Legacy evaluation and rollout-generation system removed.
Fixed a bug where train/approx_entropy would include entropy from masked-off tokens with no generation logprobs, causing NaNs to appear.
Fixed crash affecting sequence packing when responses are truncated.
Trust vLLM's tokenization over HuggingFace's, avoiding some off-policy training.
Required HF checkpointing on systems where DCP checkpointing would fail due to PCIe comms issues.
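The grpo.interleave_rolluts entry above (spelled as in the notes) recommends importance sampling to compensate for training one step off-policy. As a generic illustration of that idea, and not the project's implementation, the sketch below reweights each token by the ratio between the probability under the policy being trained and the probability under the policy that generated the rollout. The function names, the truncation cap and the sample numbers are all assumptions made for this example.

```python
import math


def importance_weights(train_logprobs, rollout_logprobs, cap=2.0):
    """Per-token ratio p_train(token) / p_rollout(token), truncated at `cap`."""
    if len(train_logprobs) != len(rollout_logprobs):
        raise ValueError("logprob sequences must have the same length")
    weights = []
    for lp_train, lp_rollout in zip(train_logprobs, rollout_logprobs):
        # exp(log p_train - log p_rollout) is the probability ratio.
        ratio = math.exp(lp_train - lp_rollout)
        # Truncation keeps a single very off-policy token from dominating.
        weights.append(min(ratio, cap))
    return weights


def weighted_mean_loss(token_losses, weights):
    """Mean of per-token losses after scaling each by its importance weight."""
    if not token_losses:
        raise ValueError("need at least one token")
    total = sum(loss * w for loss, w in zip(token_losses, weights))
    return total / len(token_losses)


# Illustrative numbers: the middle token became more likely after one update.
rollout_logprobs = [-1.0, -2.0, -0.5]
train_logprobs = [-1.0, -1.0, -0.5]
weights = importance_weights(train_logprobs, rollout_logprobs)
print(weights)  # prints [1.0, 2.0, 1.0]
print(weighted_mean_loss([1.0, 1.0, 1.0], weights))
```

Tokens whose probability did not change between the rollout policy and the training policy keep a weight of 1.0, so fully on-policy data is left untouched by this correction.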

Notability

notability 4.0/10

Minor release candidate of an existing project, limited excitement