RepoNous ResearchNous Researchpublished Apr 29, 2025seen 5d

NousResearch/atropos

Python

Open original ↗

Captured source

source ↗
published Apr 29, 2025seen 5dcaptured 16hhttp 200method plain

NousResearch/atropos

Description: Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments

Language: Python

License: MIT

Stars: 1273

Forks: 367

Open issues: 82

Created: 2025-04-29T19:02:06Z

Pushed: 2026-06-08T16:41:48Z

Default branch: main

Fork: no

Archived: no

README:

Atropos - Nous Research's LLM RL Gym

![newatr-02](banner-image.jpg)

---

What is Atropos?

Atropos is an environment microservice framework for async RL with LLMs.

Atropos encompasses both environments, which are set up as services, and a trajectory API for the environments to send data to and for the trainer to pull batches from.

!image

Atropos is a robust, scalable framework for Reinforcement Learning Environments with LLMs.

The goal: provide a flexible, scalable, and standardized platform to accelerate LLM-based RL research across diverse, interactive settings.

The framework supports collecting, distributing and evaluating LLM trajectories through diverse environments including:

---

Experimental results from models trained using Atropos' environments

We have been able to achieve significant improvements on specific domains or tasks with Atropos - Below are some of the results.

Tool Calling Environment Results:

Model Artifact: https://huggingface.co/NousResearch/DeepHermes-ToolCalling-Specialist-Atropos

Environment Used: https://github.com/NousResearch/atropos/blob/main/environments/tool_calling_server.py

---

Financial Fundamentals Prediction Environment Results:

Model Artifact: https://huggingface.co/NousResearch/DeepHermes-Financial-Fundamentals-Prediction-Specialist-Atropos

Environment Used: https://github.com/NousResearch/atropos/blob/main/environments/fundamental_prediction_environment.py

---

RLAIF Experiment Artifacts

Using the RLAIF Environment to change the personality of the model, we have produced several artifacts of interesting and weird personalities.

DeepHermes Egregore v1 and v2 8B:

https://huggingface.co/NousResearch/DeepHermes-Egregore-v1-RLAIF-8b-Atropos https://huggingface.co/NousResearch/DeepHermes-Egregore-v2-RLAIF-8b-Atropos

DeepHermes Ascension Maze 8B:

https://huggingface.co/NousResearch/DeepHermes-AscensionMaze-RLAIF-8b-Atropos

Environment Used: https://github.com/NousResearch/atropos/blob/main/environments/rlaif_server.py

---

Navigating the Repo

| Category | Description | |----------|------------| | 📁 [atroposlib/](atroposlib/) | Core library containing base classes and utilities | | 🎮 [environments/](environments/) | Collection of ready-to-use RL environments. Community contributions are typically placed in the [environments/community/](environments/community/) subdirectory. | | 📚 [example_trainer/](example_trainer/) | Example training scripts and configurations |

Key Documents:

  • [Base Environment Class](atroposlib/envs/README.md) - Documentation for creating custom environments
  • [ManagedServer Guide](atroposlib/envs/server_handling/MANAGED_SERVER.md) - Recommended approach for automatic token and logprob tracking
  • [Environments Overview and Contribution Guide](environments/community/README.md) - Documentation for existing environments and how to contribute new ones.
  • [Full Environment Config Options](CONFIG.md) - Documentation for creating custom environments
  • [Example Trainer](example_trainer/README.md) - Getting started with training
  • [Slurm Guide](SLURM.md) - Guide for using Atropos with Slurm for distributed inference
  • [Frequently Asked Questions (FAQ)](atroposlib/FAQ.md) - Answers to common questions for new users
  • [Contributing Guide](CONTRIBUTING.md) - Guidelines for contributors
  • [License](LICENSE) - MIT license details

---

Prerequisites

Before installing Atropos, ensure you have the following:

  • Python 3.10+ — Required. Check with python --version
  • Git — For cloning the repository
  • An OpenAI-compatible API endpoint — Atropos environments need an inference server. Options include:
  • A local vLLM or SGLang instance
  • An OpenAI API key (set as OPENAI_API_KEY environment variable)
  • Any provider with an OpenAI-compatible endpoint (e.g., Together AI, OpenRouter)
  • Weights & Biases account *(optional)* — For experiment tracking. Set use_wandb=False in your environment config to skip

> Note: You do not need a GPU to develop or test environments locally. A GPU is only required for running inference servers locally or for training.

---

Installation

Get your Python 3.10 (or later) environment ready, then simply pip install:

pip install atroposlib

If you're looking to get into developing the repo or using the environments:

pip install -e . # for using
pip install -e .[dev] # for development
pip install -e .[examples] # for running examples
pip install -e .[verifiers] # for verifiers integration
pip install -e .[all] # for everything

Important: If you're committing to the repository, please install the pre-commit hooks:

pre-commit install

---

Quick Start Guide

1. Create Your First Environment

  • Review our [Base Class Documentation](atroposlib/envs/README.md) to understand the core concepts
  • Check out existing environments in the [environments/](environments) directory for examples

2. Run an Example Environment

You should edit the config_init section of the environment file you want (For example, in GSM8K Environment) to point to a running VLLM or SGLang inference server as well as any other [configuration changes](CONFIG.md) you'd like to make, such as the group size, then:

> Note: By default, Atropos uses the OpenAI-compatible API endpoint which works with any provider. For enhanced features, use VLLMServer (atroposlib/envs/server_handling/vllm_server.py) or SGLangServer

Excerpt shown — open the source for the full document.

Notability

notability 5.0/10

Solid new repo with decent traction