novitalabs/rllm
forked from rllm-org/rllm
Description: Democratizing Reinforcement Learning for LLMs
License: Apache-2.0
Stars: 0
Forks: 0
Open issues: 0
Created: 2026-01-14T08:23:34Z
Pushed: 2026-03-09T09:24:47Z
Default branch: main
Fork: yes
Parent repository: rllm-org/rllm
Archived: no
README:
rLLM is an open-source framework for post-training language agents via reinforcement learning. With rLLM, you can easily build your custom agents and environments, train them with reinforcement learning, and deploy them for real-world workloads.
Releases 📰
[2025/12/11] We release rLLM v0.2.1, which adds support for the Tinker backend, LoRA and VLM training, and Eval Protocol. We also bumped our verl backend to v0.6.1. [[SDK Blogpost]](https://rllm-project.com/post.html?post=sdk.md)
[2025/10/16] rLLM v0.2 is now officially released! We introduce AgentWorkflowEngine for training over arbitrary agentic programs. It also comes integrated with the official verl-0.5.0, featuring support for Megatron training. Check out this blog post for more.
[2025/07/01] We release `DeepSWE-Preview`, a 32B software engineering (SWE) agent trained purely with RL that achieves 59% on SWEBench-Verified with test-time scaling (42.2% Pass@1), topping the SWEBench leaderboard for open-weight models.
[2025/04/08] We release `DeepCoder-14B-Preview`, a 14B coding model that achieves 60.6% Pass@1 accuracy on LiveCodeBench (+8% improvement), matching the performance of o3-mini-2025-01-31 (Low) and o1-2024-12-17.
[2025/02/10] We release `DeepScaleR-1.5B-Preview`, a 1.5B model that surpasses O1-Preview and achieves 43.1% Pass@1 on AIME. We achieve this by iteratively scaling DeepSeek's GRPO algorithm from 8K→16K→24K context length for thinking.
Getting Started 🎯
rLLM requires Python >= 3.10 (3.11 is required when using tinker). You can install it directly via pip, build it from source, or run it in Docker.
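Before installing, it can help to confirm that the active interpreter meets the version requirement for the backend you plan to use. The following is a minimal sketch, not part of rLLM: the `MIN_PYTHON` table and the `python_ok` helper are hypothetical names that simply encode the two version floors stated above.

```python
import sys

# Version floors taken from the requirement stated above.
# MIN_PYTHON and python_ok are illustrative helpers, not rLLM APIs.
MIN_PYTHON = {"verl": (3, 10), "tinker": (3, 11)}


def python_ok(backend, version=None):
    """Return True if `version` (default: the running interpreter) is new enough for `backend`."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= MIN_PYTHON[backend]


if __name__ == "__main__":
    for backend in MIN_PYTHON:
        status = "ok" if python_ok(backend) else "too old"
        print(f"{backend}: Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```

Run it with the same interpreter you intend to create the environment from; a "too old" result for tinker on Python 3.10 is expected.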
There are three ways to install rLLM:
Approach A: Direct Installation
uv pip install "rllm[verl] @ git+https://github.com/rllm-org/rllm.git"
_(or replace `verl` above with `tinker` to install the tinker backend; see below for more details)_
Approach B: Building from Source with uv
Step 1: Clone and Setup Environment
# Clone the repository
git clone https://github.com/rllm-org/rllm.git
cd rllm
# Create a uv environment
uv venv --python 3.11
source .venv/bin/activate
Step 2: Install rLLM with Training Backend
rLLM supports two training backends: verl and tinker. Choose one based on your needs.
_Option I: Using verl as Training Backend_
uv pip install -e .[verl]
_Option II: Using tinker as Training Backend_
# can add --torch-backend=cpu to train on CPU-only machines
uv pip install -e .[tinker]
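After installing with either backend, a quick way to see what actually landed in the environment is to query the installed distribution metadata. This is a hedged sketch using only the standard library: `rllm` is the distribution name used in the pip commands in this README, while `verl` and `tinker` as distribution names are assumptions, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "rllm" comes from the install commands in this README; the backend
    # distribution names "verl" and "tinker" are assumed for illustration.
    for name in ("rllm", "verl", "tinker"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Seeing only one of the two backends reported as installed is normal, since each install option pulls in a single backend.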
Approach C: Installation with Docker 🐳
For a containerized setup, you can use Docker:
# Build the Docker image
docker build -t rllm .
# Create and start the container
docker create --runtime=nvidia --gpus all --net=host --shm-size="10g" --cap-add=SYS_ADMIN -v .:/workspace/rllm -v /tmp:/tmp --name rllm-container rllm sleep infinity
docker start rllm-container
# Enter the container
docker exec -it rllm-container bash
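Because the container is created with `--runtime=nvidia --gpus all`, a useful first check inside it is whether the GPUs are actually visible. The sketch below assumes `nvidia-smi` is on the `PATH` inside the container (this README does not confirm that) and returns an empty list otherwise; `visible_gpus` is a hypothetical helper, not an rLLM function.

```python
import shutil
import subprocess


def visible_gpus():
    """Return GPU names reported by nvidia-smi, or [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"{len(gpus)} GPU(s) visible: {gpus}")
```

An empty list inside the container usually points at the host's NVIDIA container runtime setup rather than at rLLM itself.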
For a more detailed installation guide, including how to use sglang with the verl backend, please refer to our documentation.
Awesome Projects using rLLM 🔥
- DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL
- DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level
- DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL
- Cogito, Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning
- Cut the Bill, Keep the Turns: Affordable Multi-Turn Search RL
Acknowledgements
Our work is done as part of the Berkeley Sky Computing Lab. The rLLM team is generously supported by grants from Laude Institute, AWS, Hyperbolic, Fireworks AI, and Modal. We give special thanks to Together AI for the research partnership and compute support.
Citation
@misc{rllm2025,
title={rLLM: A Framework for Post-Training Language Agents},
author={Sijun Tan and Michael Luo and Colin Cai and Tarun Venkat and Kyle Montgomery and Aaron Hao and Tianhao Wu and Arnav Balyan and Manan Roongta and Chenguang Wang and Li Erran Li and Raluca Ada Popa and Ion Stoica},
year={2025},
howpublished={\url{https://pretty-radio-b75.notion.site/rLLM-A-Framework-for-Post-Training-Language-Agents-21b81902c146819db63cd98a54ba5f31}},
note={Notion Blog}
}
You may also cite our prior work DeepScaleR,…
Notability: 2.0/10. Routine fork with no traction.