What does this fork signal mean?

Nous Research forked NVIDIA-NeMo/RL as NousResearch/RL. This fork signal points to upstream code the lab may be inspecting, patching, or building on. High-signal details: repo NousResearch/RL · parent NVIDIA-NeMo/RL · Reinforcement learning research from the Nous Research collective. onlylabs links this event to 1 captured evidence page and 6 related fork signals.

Nous Research Fork: NousResearch/RL

Captured source

source ↗

GitHub/github.com/NousResearch/RL

NousResearch/RL repository metadata

published Apr 3, 2026 · seen Jun 6 · captured Jun 11 · http 200 · method plain

NousResearch/RL

Description: Scalable toolkit for efficient model reinforcement

License: Apache-2.0

Stars: 11

Forks: 2

Open issues: 1

Created: 2026-04-03T14:35:01Z

Pushed: 2026-06-04T18:36:40Z

Default branch: main

Fork: yes

Parent repository: NVIDIA-NeMo/RL

Archived: no

README:

📣 News

[03/12/2026] GDPO Support
Enabling Group reward-Decoupled Normalization Policy Optimization (GDPO) for multi-reward RL training is now supported.
Example: [gdpo_math_1B.yaml](/examples/configs/gdpo_math_1B.yaml)
Support Async RL training
WIP: Nemo-gym compatibility
[03/11/2026] Nemotron-3-Super was post-trained with NeMo-RL! Follow this guide to reproduce the full RL training recipe.
[02/04/2026] LoRA Support
LoRA SFT is supported on both DTensor and Megatron Core backends.
LoRA GRPO is supported on both DTensor and Megatron Core backends.
LoRA DPO is supported on both DTensor and Megatron Core backends.
Nano v3 LoRA recipes:
[sft-nanov3-30BA3B-2n8g-fsdp2-lora.yaml](examples/configs/recipes/llm/sft-nanov3-30BA3B-2n8g-fsdp2-lora.yaml)
[grpo-nanov3-30BA3B-2n8g-fsdp2-lora.yaml](examples/configs/recipes/llm/grpo-nanov3-30BA3B-2n8g-fsdp2-lora.yaml)
[grpo-nanov3-30BA3B-2n8g-megatron-lora.yaml](examples/configs/recipes/llm/grpo-nanov3-30BA3B-2n8g-megatron-lora.yaml)
[01/30/2026] Release v0.5.0!
Both linux/amd64 and linux/arm64 Docker containers are available on NGC nvcr.io/nvidia/nemo-rl:v0.5.0.
NeMo-Gym + NeMo-RL support
📊 View the release run metrics on Google Colab to get a head start on your experimentation.
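
The GDPO entry above names the technique but does not spell it out. Reading the name literally, each reward signal is normalized within its rollout group on its own before the signals are combined, instead of summing raw rewards and normalizing once. The sketch below illustrates that reading only. The function names, the plain sum used to combine rewards, and the sample values are assumptions for illustration, not NeMo RL's implementation or the full published method.

```python
import math


def group_normalize(values, eps=1e-6):
    """Standardize one reward signal across the rollouts of a single prompt group."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / (math.sqrt(var) + eps) for v in values]


def decoupled_advantages(rewards):
    """rewards[i][k] is reward k for rollout i (hypothetical layout).

    Each reward column is normalized separately, then the columns are summed.
    """
    num_rewards = len(rewards[0])
    columns = [
        group_normalize([row[k] for row in rewards]) for k in range(num_rewards)
    ]
    return [sum(col[i] for col in columns) for i in range(len(rewards))]


def coupled_advantages(rewards):
    """Baseline for comparison: sum raw rewards first, then normalize once."""
    return group_normalize([sum(row) for row in rewards])


# Hypothetical group of 4 rollouts with a 0/1 reward and a 0/10 reward.
group = [[1, 0], [0, 10], [1, 10], [0, 0]]
decoupled = decoupled_advantages(group)
coupled = coupled_advantages(group)
```

In this toy group, the coupled baseline lets the larger-scale reward dominate, so rollout 1 scores well above rollout 0. With decoupled normalization the two rollouts, which each satisfy exactly one reward, end up with nearly equal advantages.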

Previous News

[12/15/2025] NeMo-RL is the framework that trained NVIDIA-NeMotron-3-Nano-30B-A3B-FP8! [This guide](docs/guides/nemotron-3-nano.md) provides reproducible instructions for the post-training process.
[10/10/2025] DAPO Algorithm Support

NeMo RL now supports the Decoupled Clip and Dynamic Sampling Policy Optimization (DAPO) algorithm, which extends GRPO with Clip-Higher, Dynamic Sampling, Token-Level Policy Gradient Loss, and Overlong Reward Shaping for more stable and efficient RL training. See the [DAPO guide](docs/guides/dapo.md) for more details.
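
Two of the DAPO ingredients listed above, Clip-Higher and the token-level policy gradient loss, can be sketched in a few lines of plain Python. This is a minimal illustration and not NeMo RL code. The function name and input layout are hypothetical, and the default clip values (0.2 and 0.28) are the ones commonly cited for DAPO and are assumed here. Dynamic Sampling and Overlong Reward Shaping are not shown.

```python
import math


def dapo_token_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.28):
    """Clipped surrogate loss averaged over all tokens in the batch.

    logp_new, logp_old: per-sequence lists of token log-probabilities
    advantages: one advantage per sequence (assumed computed upstream)
    """
    total, n_tokens = 0.0, 0
    for seq_new, seq_old, adv in zip(logp_new, logp_old, advantages):
        for lp_new, lp_old in zip(seq_new, seq_old):
            ratio = math.exp(lp_new - lp_old)
            # Clip-Higher: the upper bound (1 + eps_high) is decoupled from
            # the lower bound (1 - eps_low) instead of using one symmetric eps.
            clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
            total += min(ratio * adv, clipped * adv)
            n_tokens += 1
    # Token-level loss: divide by the total token count, so longer sequences
    # contribute proportionally more than under a per-sequence average.
    return -total / n_tokens
```

With identical old and new log-probabilities, a one-token sequence with advantage +1 and a three-token sequence with advantage -1 give a loss of 0.5, whereas averaging per sequence first would give 0. That difference is the effect of token-level weighting.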

[9/27/2025] FP8 Quantization in NeMo RL
[9/25/2025] On-policy Distillation
Student generates on-policy sequences and aligns logits to a larger teacher via KL, achieving near-larger-model quality at lower cost than RL. See [On-policy Distillation](#on-policy-distillation).
[12/1/2025] Release v0.4.0!
First release with official NGC Container nvcr.io/nvidia/nemo-rl:v0.4.0.
📊 View the release run metrics on Google Colab to get a head start on your experimentation.
[9/30/2025] Accelerated RL on GCP with NeMo RL!
[8/15/2025] NeMo-RL: Journey of Optimizing Weight Transfer in Large MoE Models by 10x
[7/31/2025] NeMo-RL V0.3: Scalable and Performant Post-training with Nemo-RL via Megatron-Core
[7/25/2025] Release v0.3.0!
📝 v0.3.0 Announcement
📊 View the release run metrics on Google Colab to get a head start on your experimentation.
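
The on-policy distillation entry above describes aligning student logits to a teacher via KL on sequences the student itself generated. A minimal sketch of the per-token objective follows. The source does not state the KL direction, so the reverse KL used here (student relative to teacher) is an assumption, and the function names and list-based inputs are hypothetical rather than NeMo RL's API.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) at one token position (direction is an assumption)."""
    s = log_softmax(student_logits)
    t = log_softmax(teacher_logits)
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(s, t))


def distillation_loss(student_seq, teacher_seq):
    """Mean per-token KL over a sequence sampled from the student.

    Both arguments hold one logit vector per generated token; the teacher
    logits are assumed to come from scoring the same student-generated tokens.
    """
    per_token = [reverse_kl(s, t) for s, t in zip(student_seq, teacher_seq)]
    return sum(per_token) / len(per_token)
```

The loss is zero when the student's distribution matches the teacher's at every position and positive otherwise, so minimizing it pulls the student toward the teacher on the states the student actually visits.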

[5/14/2025] [Reproduce DeepscaleR with NeMo RL!](docs/guides/grpo-deepscaler.md)
[5/14/2025] Release v0.2.1!
📊 View the release run metrics on Google Colab to get a head start on your experimentation.

Overview

NeMo RL is an open-source post-training library under the NVIDIA NeMo Framework that streamlines and scales reinforcement learning methods for multimodal models (LLMs, VLMs, etc.). Designed for flexibility, reproducibility, and scale, NeMo RL enables both small-scale experiments and massive multi-GPU, multi-node deployments for fast experimentation in research and production environments.

[Figure: NeMo RL Architecture Diagram]

What you can expect:

Flexibility with a modular design that allows easy integration and customization.
Efficient resource management using Ray, enabling scalable and flexible deployment across different hardware configurations.
Hackable with native PyTorch-only paths for quick research prototypes.
High performance with Megatron Core, supporting various parallelism techniques for large models and large context lengths.
Seamless integration with Hugging Face for ease of use, allowing users to leverage a wide range of pre-trained models and tools.
Comprehensive documentation that is both detailed and user-friendly, with practical examples.

Please refer to our design documents for more details on the architecture and design philosophy.

Training Backends

NeMo RL supports multiple training backends to accommodate different model sizes and hardware configurations:

DTensor - PyTorch's next-generation distributed training with improved memory...

Excerpt shown — open the source for the full document.

Notability

notability 2.0/10

low stars · routine fork