Fork · Nous Research · published Mar 31, 2026

NousResearch/Gym

forked from NVIDIA-NeMo/Gym

Description: Build RL environments for LLM training

Language: Python

License: Apache-2.0

Stars: 17

Forks: 2

Open issues: 0

Created: 2026-03-31T01:31:07Z

Pushed: 2026-04-27T20:10:18Z

Default branch: main

Fork: yes

Parent repository: NVIDIA-NeMo/Gym

Archived: no

README:

NeMo Gym

[Requirements](#-requirements) | [Quick Start](#-quick-start) | [Available Environments](#-available-environments) | [Documentation & Resources](#-documentation--resources) | [Community & Support](#-community--support) | [Citations](#-citations)

NeMo Gym is a library for building reinforcement learning (RL) training environments for large language models (LLMs). It provides infrastructure to develop environments, scale rollout collection, and integrate seamlessly with your preferred training framework.

🏆 Why NeMo Gym?

  • Scaffolding and patterns to accelerate environment development: multi-step, multi-turn, and user modeling scenarios
  • Contribute environments without expert knowledge of the entire RL training loop
  • Test environments and throughput end-to-end, independent of the RL training loop
  • Interoperable with existing environments, systems, and RL training frameworks
  • Growing collection of training environments and datasets for Reinforcement Learning from Verifiable Reward (RLVR)

> [!IMPORTANT]
> NeMo Gym is currently in early development. You should expect evolving APIs, incomplete documentation, and occasional bugs. We welcome contributions and feedback; for any changes, please open an issue first to kick off discussion!

🔗 Ecosystem

NeMo Gym is part of NVIDIA NeMo, NVIDIA's GPU-accelerated platform for building and training generative AI models. NeMo Gym integrates with a growing number of RL training frameworks and environment libraries; see the Ecosystem page for full details and tutorials.

Training Frameworks: NeMo RL, OpenRLHF, Unsloth, more →

Environment Libraries: Reasoning Gym, Aviary, more →

📋 Requirements

NeMo Gym is designed to run on standard development machines:

| Hardware Requirements | Software Requirements |
| --------------------- | --------------------- |
| GPU: Not required for NeMo Gym library operation<br>• GPU may be needed for specific resources servers or model inference (see individual server documentation) | Operating System:<br>• Linux (Ubuntu 20.04+, or equivalent)<br>• macOS (11.0+ for x86_64, 12.0+ for Apple Silicon)<br>• Windows (via WSL2) |
| CPU: Any modern x86_64 or ARM64 processor (e.g., Intel, AMD, Apple Silicon) | Python: 3.12 or higher |
| RAM: Minimum 8 GB (16 GB+ recommended for larger environments) | Git: For cloning the repository |
| Storage: Minimum 5 GB free disk space for installation and basic usage | Internet Connection: Required for downloading dependencies and API access |

Additional Requirements

  • API Keys: OpenAI API key with available credits (for the quickstart examples). Other model providers are also supported (Azure OpenAI, self-hosted models via vLLM).
  • Ray: Automatically installed as a dependency (no separate setup required)
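
Some of the requirements above can be checked locally before installing. The following is a minimal sketch using only the Python standard library; `preflight` and its result keys are hypothetical names chosen for illustration and are not part of NeMo Gym. It covers the Python version, free disk space, and the presence of Git, and does not attempt to check RAM or network access.

```python
import shutil
import sys

# Thresholds taken from the requirements table above.
MIN_PYTHON = (3, 12)
MIN_FREE_DISK_GB = 5


def preflight(path: str = ".") -> dict[str, bool]:
    """Return a pass/fail flag for each requirement that can be checked locally."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "python": sys.version_info[:2] >= MIN_PYTHON,
        "disk": free_gb >= MIN_FREE_DISK_GB,
        "git": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'not satisfied'}")
```

A failed flag here only indicates that the documented minimum is not met on this machine; it says nothing about the extra needs of individual resources servers.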

🚀 Quick Start

Install NeMo Gym, start the servers, and collect your first verified rollouts for RL training.

Setup

# Clone the repository
git clone git@github.com:NVIDIA-NeMo/Gym.git
cd Gym

# Install UV (Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env

# Create virtual environment
uv venv --python 3.12
source .venv/bin/activate

# Install NeMo Gym
uv sync --extra dev --group docs

Configure Your API Key

Create an env.yaml file that contains your OpenAI API key and the policy model you want to use. Replace your-openai-api-key with your actual key. This file helps keep your secrets out of version control while still making them available to NeMo Gym.

echo "policy_base_url: https://api.openai.com/v1
policy_api_key: your-openai-api-key
policy_model_name: gpt-4.1-2025-04-14" > env.yaml

> [!NOTE]
> We use GPT-4.1 in this quickstart because it provides low latency (no reasoning step) and works reliably out-of-the-box. NeMo Gym is not limited to OpenAI models; you can use self-hosted models via vLLM or any OpenAI-compatible inference server. See the documentation for details.
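
If you would rather not paste the key into a shell command, the same env.yaml can be generated from an environment variable. This is a minimal sketch, assuming the key is stored in `OPENAI_API_KEY` (an assumed variable name, not something NeMo Gym requires); `render_env_yaml` and `write_env_yaml` are hypothetical helpers, not NeMo Gym APIs. The three keys and their default values are the ones from the quickstart command above.

```python
import os
from pathlib import Path


def render_env_yaml(
    api_key: str,
    base_url: str = "https://api.openai.com/v1",
    model: str = "gpt-4.1-2025-04-14",
) -> str:
    """Render the three quickstart keys as plain YAML text."""
    return (
        f"policy_base_url: {base_url}\n"
        f"policy_api_key: {api_key}\n"
        f"policy_model_name: {model}\n"
    )


def write_env_yaml(path: str = "env.yaml") -> Path:
    """Write env.yaml using the key from the environment (assumed variable name)."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY before generating env.yaml")
    target = Path(path)
    target.write_text(render_env_yaml(api_key), encoding="utf-8")
    return target
```

Calling `write_env_yaml()` from the repository root produces the same three-line file as the `echo` command, without the key appearing in your shell history.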

Start Servers

Terminal 1 (start servers):

# Start servers (this will keep running)
config_paths="resources_servers/example_single_tool_call/configs/example_single_tool_call.yaml,\
responses_api_models/openai_model/configs/openai_model.yaml"
ng_run "+config_paths=[${config_paths}]"
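
The `+config_paths=[...]` argument is a comma-separated list of config files wrapped in brackets. As an illustration only, the sketch below assembles the same `ng_run` argument list in Python, which may be convenient when launching the servers from a script. `build_ng_run_command` is a hypothetical helper, and the sketch assumes `ng_run` is on your PATH once the virtual environment is active.

```python
import shlex


def build_ng_run_command(config_paths: list[str]) -> list[str]:
    """Build the argument list for ng_run from a list of config file paths."""
    if not config_paths:
        raise ValueError("at least one config path is required")
    # Same shape as the shell example: +config_paths=[a.yaml,b.yaml]
    override = "+config_paths=[" + ",".join(config_paths) + "]"
    return ["ng_run", override]


cmd = build_ng_run_command(
    [
        "resources_servers/example_single_tool_call/configs/example_single_tool_call.yaml",
        "responses_api_models/openai_model/configs/openai_model.yaml",
    ]
)
# Print a shell-quoted form of the command for inspection.
print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would start the servers in the foreground, just like the shell command.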

Terminal 2 (interact with agent):

# In a NEW terminal, activate environment
source .venv/bin/activate

# Interact with your agent
python responses_api_agents/simple_agent/client.py

Collect Rollouts

Terminal 2 (keep servers running in Terminal 1):

# Create a simple dataset with one query
echo '{"responses_create_params":{"input":[{"role":"developer","content":"You are a helpful assistant."},{"role":"user","content":"What is the weather in Seattle?"}]}}' > weather_query.jsonl

# Collect verified rollouts
ng_collect_rollouts \
+agent_name=example_single_tool_call_simple_agent \
+input_jsonl_fpath=weather_query.jsonl \
+output_jsonl_fpath=weather_rollouts.jsonl

# View the result
cat weather_rollouts.jsonl | python -m json.tool

This generates training data with verification scores!
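
For more than one query, writing the input file by hand becomes error-prone. The sketch below builds rows in the same shape as the single-query example above and reads a JSONL file back for inspection. `make_row`, `write_jsonl`, and `read_jsonl` are hypothetical helpers, not NeMo Gym APIs. The rollout output schema is not described here, so the reader only parses each line and makes no assumption about field names.

```python
import json


def make_row(user_message: str, system_prompt: str = "You are a helpful assistant.") -> dict:
    """Build one dataset row in the shape used by the quickstart example."""
    return {
        "responses_create_params": {
            "input": [
                {"role": "developer", "content": system_prompt},
                {"role": "user", "content": user_message},
            ]
        }
    }


def write_jsonl(path: str, rows) -> int:
    """Write one JSON object per line and return the number of rows written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
            count += 1
    return count


def read_jsonl(path: str) -> list[dict]:
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `write_jsonl("weather_query.jsonl", [make_row("What is the weather in Seattle?")])` reproduces the input file from the step above, and `read_jsonl("weather_rollouts.jsonl")` loads the collected rollouts so you can list each record's top-level keys with `sorted(record)`.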

Clean Up Servers

In Terminal 1, where the servers are running, press Ctrl+C to stop the ng_run process.
