What does this repo signal mean?

Tencent Hunyuan published Tencent-Hunyuan/GradLoc (Python). Repository signals of this kind can expose tooling, eval, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo Tencent-Hunyuan/GradLoc · language Python · new repo, low traction. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Tencent Hunyuan Repo: Tencent-Hunyuan/GradLoc

Captured source

GitHub/github.com/Tencent-Hunyuan/GradLoc

Tencent-Hunyuan/GradLoc repository metadata

published Feb 10, 2026 · seen Jun 5 · captured Jun 11 · http 200 · method plain

Description: Implementation of GradLoc from the Tencent Hunyuan blog "Stabilizing RLVR via Token-level Gradient Diagnosis and Layerwise Clipping".

Language: Python

License: NOASSERTION

Stars: 99

Forks: 13

Open issues: 1

Created: 2026-02-10T06:12:48Z

Pushed: 2026-02-16T09:17:25Z

Default branch: main

Fork: no

Archived: no

README:

🔍 Overview

This repository implements the GradLoc part from our blog on RLVR training collapse diagnosis and stabilization.

The current release focuses on the GradLoc demo patch:

GradLoc: localizes gradient spikes to exact culprit tokens with distributed binary search (O(log N)).

![GradLoc framework](./assets/framework.png) *Figure 2. GradLoc localization path: global -> micro-batch -> rank -> token, with adaptive thresholds.*
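
The localization idea can be illustrated with a minimal single-process sketch. It assumes a single culprit token dominates the gradient norm and uses a fixed threshold, whereas the patch itself is described as distributed (global -> micro-batch -> rank -> token) with adaptive thresholds. The names `localize_spike` and `toy_probe` are hypothetical and do not come from the repository; `probe(lo, hi)` stands in for a forward/backward pass restricted to tokens in `[lo, hi)`.

```python
from typing import Callable, List, Optional, Tuple


def localize_spike(
    probe: Callable[[int, int], float],
    n_tokens: int,
    threshold: float,
    budget: int,
) -> Optional[int]:
    """Binary-search for the token whose gradient norm exceeds `threshold`.

    Hypothetical helper, not the patch's API. `probe(lo, hi)` is assumed to
    return the gradient norm when only tokens in [lo, hi) contribute.
    Returns None if no spike is observed or the probe budget is exhausted.
    """
    if n_tokens == 0 or budget < 1:
        return None
    used = 1
    if probe(0, n_tokens) <= threshold:
        return None  # nothing to localize
    lo, hi = 0, n_tokens
    while hi - lo > 1:
        if used >= budget:
            return None  # ran out of forward/backward probes
        mid = (lo + hi) // 2
        used += 1
        if probe(lo, mid) > threshold:
            hi = mid  # spike is in the left half
        else:
            lo = mid  # otherwise assume it is in the right half
    return lo


# Toy data: 16 tokens with small gradients and one spike at index 11.
token_grads: List[float] = [0.1] * 16
token_grads[11] = 900.0
calls: List[Tuple[int, int]] = []


def toy_probe(lo: int, hi: int) -> float:
    calls.append((lo, hi))
    return abs(sum(token_grads[lo:hi]))


culprit = localize_spike(toy_probe, len(token_grads), threshold=640.0, budget=128)
print(culprit, len(calls))
```

Each probe halves the candidate range, which is where the O(log N) probe count comes from; the budget argument mirrors the idea of capping the number of forward/backward probes.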

This repo is intentionally lightweight and patch-oriented, so you can directly apply changes to upstream verl and reproduce experiments. We plan to further package GradLoc as a cleaner, configurable feature with better veRL integration and upstream-merge readiness in future releases.

The following arguments in run_experiment.sh are the core runtime knobs for GradLoc. They control trigger sensitivity, search budget, and dump path.

# grad_norm_threshold: spike trigger threshold for token-level grad norm
# bisect_budget_steps: max binary-search budget (forward/backward probes)
# bisect_dump_dir: output dir for localization artifacts
actor_rollout_ref.actor.grad_norm_threshold=640.0 \
actor_rollout_ref.actor.bisect_budget_steps=128 \
actor_rollout_ref.actor.bisect_dump_dir="${CKPTS_DIR}/bisect_dump" \
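
To make the dotted keys easier to read, here is a small illustrative sketch of how such `key=value` overrides map onto a nested configuration. It is not how verl parses its arguments; `parse_overrides` is a hypothetical helper, and the dump path `/ckpts/bisect_dump` is a made-up stand-in for the expanded `${CKPTS_DIR}` value.

```python
from typing import Any, Dict, List


def _coerce(raw: str) -> Any:
    """Turn a raw string into int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def parse_overrides(overrides: List[str]) -> Dict[str, Any]:
    """Hypothetical helper: build a nested dict from dotted key=value strings."""
    config: Dict[str, Any] = {}
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = _coerce(raw)
    return config


cfg = parse_overrides([
    "actor_rollout_ref.actor.grad_norm_threshold=640.0",
    "actor_rollout_ref.actor.bisect_budget_steps=128",
    "actor_rollout_ref.actor.bisect_dump_dir=/ckpts/bisect_dump",  # assumed path
])
actor = cfg["actor_rollout_ref"]["actor"]
print(actor)
```

All three knobs end up under the same `actor_rollout_ref.actor` node, which suggests they are read by the actor worker.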

🧩 Base commit

Upstream: verl
Commit: f9c855f7cf04d603c9546bc01776c74806a879c1

📦 Files changed by this patch

verl/trainer/ppo/ray_trainer.py
verl/utils/reward_score/__init__.py
verl/utils/reward_score/math_verify.py
verl/workers/actor/dp_actor.py

⚡ Quick start (online patch)

1) Clone upstream verl and checkout the base commit:

git clone https://github.com/volcengine/verl.git
cd verl && git checkout f9c855f7cf04d603c9546bc01776c74806a879c1

2) Apply patch from URL:

python /path/to/GradLoc-Patch/apply_patch.py --repo /path/to/verl --patch-url --sha256-file
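
The `--sha256-file` flag suggests the downloaded patch is checked against a checksum, though how `apply_patch.py` does this is not shown in this excerpt. A minimal sketch of such a check, assuming the checksum file uses the common `<hex digest>  <filename>` layout, could look like the following; `sha256_of` and `verify_patch` are hypothetical names, and the demo writes throwaway files rather than touching a real patch.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_patch(patch_path: Path, sha256_path: Path) -> bool:
    """Hypothetical check: compare the patch digest with the recorded one.

    Assumes the checksum file starts with the hex digest, optionally
    followed by whitespace and a filename.
    """
    expected = Path(sha256_path).read_text().split()[0].lower()
    return sha256_of(Path(patch_path)) == expected


# Demo with temporary files standing in for the patch and its checksum.
with tempfile.TemporaryDirectory() as tmp:
    patch = Path(tmp) / "gradloc.patch"
    patch.write_bytes(b"--- a/file\n+++ b/file\n")
    checksum = Path(tmp) / "gradloc.patch.sha256"
    checksum.write_text(f"{sha256_of(patch)}  gradloc.patch\n")
    ok = verify_patch(patch, checksum)

    patch.write_bytes(b"tampered\n")  # any change should fail verification
    tampered_ok = verify_patch(patch, checksum)

print(ok, tampered_ok)
```

Verifying the digest before applying a patch fetched from a URL guards against a truncated or altered download.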

💾 Local patch (offline)

If patches/gradloc.patch is already available locally:

python /path/to/GradLoc-Patch/apply_patch.py --repo /path/to/verl --patch-file /path/to/GradLoc-Patch/patches/gradloc.patch

🧪 Run experiment

bash /path/to/GradLoc-Patch/run_experiment.sh

🛠️ Regenerate patch after development

When code is modified on top of the base commit, regenerate the patch with:

bash /path/to/GradLoc-Patch/make_patch.sh --repo /path/to/verl

This rewrites patches/gradloc.patch from: git diff

📬 Contact Us

Guanhua Huang: carlan0974@gmail.com
Tingqiang Xu: xtq23@mails.tsinghua.edu.cn
Jinbo Wang: wangjinbo@stu.pku.edu.cn (wangjinbo@ustc.edu for long-term contact)

📚 Citation

If you find this project useful, please cite:

@misc{huang-xu-wang-2026-gradloc,
  title = {Stabilizing RLVR via Token-level Gradient Diagnosis and Layerwise Clipping},
  author = {Huang, Guanhua and Xu, Tingqiang and Wang, Jinbo and Sheng, Guangming and Li, Siheng and Yang, Evander and Li, Kejiao and Li, Yunxiang and Xu, Zenan and Yi, Qi and Gong, Xue and Nan, Ziyuan and Jiang, Yuhao and Zhang, Chenchen and Wu, Taiqiang and Zhang, Feiyuan and Wang, Junhao and Zhou, Bo and Chen, Alex and Wang, Di and Yao, Shunyu},
  year = {2026},
  url = {https://hy.tencent.com/research/100015}
}

❓ TBD

Notability

notability 4.0/10

New repo, low traction.