amazon-science/reskill
Description: An easy-to-configure and extensible veRL extension for agent RL training with skill co-evolution.
Language: Python
License: Apache-2.0
Stars: 6
Forks: 1
Open issues: 5
Created: 2026-06-04T02:13:35Z
Pushed: 2026-06-11T00:17:28Z
Default branch: main
Fork: no
Archived: no
README:
ReSkill
*An easy-to-configure, extensible veRL extension that brings the Anthropic Skill Creator into agentic RL training. Full control over skill versioning, sampling, bundle testing, and skill-policy co-evolution.*
Official code for the paper: ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL.
---
🔥 News
- [2026-06] 🎉 Paper and codebase are now public. More is on the way, so stay tuned!
---
🧩 System Overview
(a) Inspired by Anthropic's human-in-the-loop Skill Creator, ReSkill recasts skill creation as an RL-in-the-loop process. (b) Compared with decoupled skill-update methods, ReSkill exposes a highly configurable loop for jointly evolving skills and policies.
ReSkill combines three pieces:
- RL training with per-turn skill customization: veRL handles distributed RL, while ReSkill follows the verl-agent design of decomposing multi-turn agent rollouts and adds skill loading to each turn.
- RL-in-the-loop skill creation: ReSkill adapts the structure of Anthropic's Skill Creator into an RL feedback loop that analyzes rollout experience and proposes skill updates during training.
- Skill versioning and sampling: ReSkill tracks skill versions, loads active skills, samples and tests skill bundles, and supports skill-policy co-evolution over training.
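The versioning, sampling, and per-turn loading described above can be pictured with a small sketch. This is not ReSkill's actual API: `SkillVersion`, `SkillLibrary`, `sample_bundle`, and `build_turn_prompt` are hypothetical names invented for illustration, and uniform random sampling stands in for whatever sampling strategy the codebase really uses.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SkillVersion:
    """One stored revision of a skill's text (hypothetical structure)."""
    name: str
    version: int
    text: str


@dataclass
class SkillLibrary:
    """Toy in-memory store: every revision is kept, at most one per skill is active."""
    versions: dict = field(default_factory=dict)  # name -> list of SkillVersion
    active: dict = field(default_factory=dict)    # name -> active version number

    def propose(self, name: str, text: str) -> SkillVersion:
        """Record a new revision without activating it."""
        history = self.versions.setdefault(name, [])
        revision = SkillVersion(name, len(history) + 1, text)
        history.append(revision)
        return revision

    def activate(self, name: str, version: int) -> None:
        """Mark one stored revision as the active one for this skill."""
        if not 1 <= version <= len(self.versions.get(name, [])):
            raise KeyError(f"unknown revision {name}@v{version}")
        self.active[name] = version

    def sample_bundle(self, budget: int, rng: random.Random) -> list:
        """Uniformly sample up to `budget` active skills (an assumed strategy)."""
        names = sorted(self.active)
        chosen = rng.sample(names, k=min(budget, len(names)))
        return [self.versions[n][self.active[n] - 1] for n in sorted(chosen)]


def build_turn_prompt(observation: str, bundle: list) -> str:
    """Prepend the sampled skills to a single turn's observation."""
    skills = "\n".join(f"[{s.name}@v{s.version}] {s.text}" for s in bundle)
    body = f"Observation: {observation}"
    return f"{skills}\n\n{body}" if skills else body


# Usage: keep two revisions of one skill, activate the newer one, load it into a turn.
library = SkillLibrary()
library.propose("search", "Check containers before open surfaces.")
library.propose("search", "Check containers first and do not revisit locations.")
library.activate("search", 2)
bundle = library.sample_bundle(budget=1, rng=random.Random(0))
print(build_turn_prompt("You are in the kitchen.", bundle))
```

Keeping every revision rather than overwriting in place is what makes it possible to test a proposed version against the active one before switching, which is the kind of control the overview describes.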
⚙️ Installation
git clone https://github.com/amazon-science/reskill.git
cd reskill
git submodule update --init --recursive verl
pip install -e .
Install only the benchmark and backend extras you need:
pip install -e ".[,vllm]"
Validated stack pins are recorded under requirements/.
The current benchmark extras are alfworld, search, and scienceworld. Additional environment support will be added over time.
🚀 Usage
Prepare data for an environment:
python scripts/data_prep/prepare_.py --output_dir data/
Run training:
python scripts/train.py --config-name
Concrete configs live under configs/, and cluster launch examples live under scripts/launch/.
🛠️ Customize ReSkill
ReSkill is designed so both sides of the co-evolution loop can be customized.
- Policy side: customize the environment, rollout format, action projection, rewards, group rollout settings, and backend profiles.
- Skill side: customize skill-generation prompts, trigger behavior, active skill budgets, version testing/sampling, and skill library persistence.
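One way to picture the two sides is as separate groups of settings joined in a single training config. The sketch below is illustrative only: the class names, field names, and default values are assumptions, not the schema of the files under configs/, and the fixed-interval trigger is just one possible reading of "trigger behavior".

```python
from dataclasses import dataclass, field


@dataclass
class PolicyConfig:
    """Policy-side knobs (hypothetical names and defaults)."""
    env_name: str = "alfworld"      # one of the benchmark extras listed in this README
    rollout_backend: str = "vllm"   # backend extra named in this README
    group_size: int = 8             # rollouts per group, an assumed setting


@dataclass
class SkillConfig:
    """Skill-side knobs (hypothetical names and defaults)."""
    max_active_skills: int = 4            # active skill budget
    update_every_n_steps: int = 10        # assumed fixed-interval trigger
    library_dir: str = "skill_library"    # where a skill library could be persisted

    def __post_init__(self) -> None:
        if self.update_every_n_steps < 1:
            raise ValueError("update_every_n_steps must be at least 1")
        if self.max_active_skills < 0:
            raise ValueError("max_active_skills must be non-negative")


@dataclass
class CoEvolutionConfig:
    """Joins both sides so one object describes a training run."""
    policy: PolicyConfig = field(default_factory=PolicyConfig)
    skill: SkillConfig = field(default_factory=SkillConfig)

    def should_update_skills(self, step: int) -> bool:
        """Return True on the training steps where skill creation would run."""
        return step > 0 and step % self.skill.update_every_n_steps == 0


# Usage: shrink the skill budget and trigger skill updates more often.
config = CoEvolutionConfig(skill=SkillConfig(max_active_skills=2, update_every_n_steps=5))
update_steps = [step for step in range(1, 21) if config.should_update_skills(step)]
print(update_steps)
```

Separating the two groups keeps a change to the skill budget or trigger from touching rollout or reward settings, which mirrors the policy-side and skill-side split in the list above.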
📢 Release Note
> This codebase is under active restructuring and testing as we work toward a stable release. Thank you for your patience and interest!
🗺️ Roadmap
- Track newer veRL releases.
- Add SGLang rollout backend support.
- Add backend config profiles for vLLM and SGLang.
- Expand validated environment examples.
🙏 Acknowledgements
We thank the contributors to veRL, verl-agent, and Anthropic Skill Creator for their open-source foundations and inspiration, which ReSkill builds upon.
📄 License
Apache 2.0
📚 Citation
If you find this work helpful, please consider citing our paper and starring the repository.
@article{he2026reskill,
title={ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL},
author={He, Zelin and Lin, Haotian and Han, Boran and Zhu, Wei and Fang, Haoyang and Wang, Bernie and Zhu, Xuan and Li, Runze and Reimherr, Matthew},
journal={arXiv preprint arXiv:2606.01619},
year={2026}
}