What does this repo signal mean?

Zhipu AI (GLM) published zai-org/GLM-5. This repository signal exposes tooling, eval, infrastructure, or model-adjacent work before it may appear in a launch post. High-signal details: repo zai-org/GLM-5 · Notable model release with good traction.. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Zhipu AI (GLM) Repo: zai-org/GLM-5

Captured source

source ↗

GitHub/github.com/zai-org/GLM-5

zai-org/GLM-5 repository metadata

Source ↗

published Feb 9, 2026seen Jun 5captured Jun 11http 200method plain

zai-org/GLM-5

Description: GLM-5: From Vibe Coding to Agentic Engineering

License: Apache-2.0

Stars: 3387

Forks: 372

Open issues: 32

Created: 2026-02-09T08:17:02Z

Pushed: 2026-05-15T05:06:07Z

Default branch: main

Fork: no

Archived: no

README:

GLM-5.1 & GLM-5

👋 Join our Wechat or Discord community.

📖 Check out the GLM-5.1 blog and GLM-5 Technical report.

📍 Use GLM-5.1 API services on Z.ai API Platform.

🔜 GLM-5.1 will be available on chat.z.ai in the coming days.

Introduction

GLM-5.1

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

![bench_51](resources/bench_51.png)

But the most meaningful leap goes beyond first-pass performance. Previous models—including GLM-5—tend to exhaust their repertoire early: they apply familiar techniques for quick initial gains, then plateau. Giving them more time doesn't help.

GLM-5.1, by contrast, is built to stay effective on agentic tasks over much longer horizons. We've found that the model handles ambiguous problems with better judgment and stays productive over longer sessions. It breaks complex problems down, runs experiments, reads results, and identifies blockers with real precision. By revisiting its reasoning and revising its strategy through repeated iteration, GLM-5.1 sustains optimization over hundreds of rounds and thousands of tool calls. The longer it runs, the better the result.

GLM-5

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

Reinforcement learning aims to bridge the gap between competence and excellence in pre-trained models. However, deploying it at scale for LLMs is a challenge due to the RL training inefficiency. To this end, we developed slime, a novel asynchronous RL infrastructure that substantially improves training throughput and efficiency, enabling more fine-grained post-training iterations. With advances in both pre-training and post-training, GLM-5 delivers significant improvement compared to GLM-4.7 across a wide range of academic benchmarks and achieves best-in-class performance among all open-source models in the world on reasoning, coding, and agentic tasks, closing the gap with frontier models.

![bench](resources/bench.png)

GLM-5 is purpose-built for complex systems engineering and long-horizon agentic tasks. On our internal evaluation suite CC-Bench-V2, GLM-5 significantly outperforms GLM-4.7 across frontend, backend, and long-horizon tasks, narrowing the gap to Claude Opus 4.5.

![realworld_bench](resources/realworld_bench.png)

On Vending Bench 2, a benchmark that measures long-term operational capability, GLM-5 ranks \#1 among open-source models. Vending Bench 2 requires the model to run a simulated vending machine business over a one-year horizon; GLM-5 finishes with a final account balance of $4,432, approaching Claude Opus 4.5 and demonstrating strong long-term planning and resource management.

![vending_bench](resources/vending_bench.png)

Download Model

| Model | Download Links | Model Size | Precision | |-------------|-------------------------------------------------------------------------------------------------------------------------------------|------------|-----------| | GLM-5.1 | 🤗 Hugging Face 🤖 ModelScope | 744B-A40B | BF16 | | GLM-5.1-FP8 | 🤗 Hugging Face 🤖 ModelScope | 744B-A40B | FP8 | | GLM-5 | 🤗 Hugging Face 🤖 ModelScope | 744B-A40B | BF16 | | GLM-5-FP8 | 🤗 Hugging Face 🤖 ModelScope | 744B-A40B | FP8 |

Serve GLM-5 Series Locally

Prepare environment

vLLM, SGLang, xLLM and Ktransformers all support local deployment of GLM-5 series model, A simple deployment guide is provided here.

+ vLLM

Using Docker as:

docker pull vllm/vllm-openai:v0.20.2-cu129
docker pull vllm/vllm-openai:v0.20.2 # For CUDA 13.0

+ SGLang

Using Docker as:

docker pull lmsysorg/sglang:v0.5.11
docker pull lmsysorg/sglang:v0.5.11-cu130 # For CUDA 13.0

Deploy

+ vLLM

vllm serve zai-org/GLM-5.1-FP8 \
--tensor-parallel-size 8 \
--gpu-memory-utilization 0.85 \
--speculative-config.method mtp \
--speculative-config.num_speculative_tokens 3 \
--tool-call-parser glm47 \
--reasoning-parser glm45 \
--enable-auto-tool-choice \
--chat-template-content-format=string \
--served-model-name glm-5.1-fp8

Check the recipes for more details. >Note: When encounter Tool Call Parse issue with MTP enabled, please turn to vllm main branch to serve GLM-5.1.

+ SGLang

sglang serve \
--model-path zai-org/GLM-5.1-FP8 \
--tp-size 8 \
--tool-call-parser glm47 \
--reasoning-parser glm45 \
--speculative-algorithm EAGLE \
--speculative-num-steps 3 \
--speculative-eagle-topk 1 \
--speculative-num-draft-tokens 4 \
--mem-fraction-static 0.85 \
--served-model-name glm-5.1-fp8 \
--port 8000 \
--host 0.0.0.0

Check the sglang cookbook for more details.

+ xLLM

Please check the deployment guide here.

+ Ktransformers

Please check the deployment guide here.

Citation

If you find GLM-5 series model useful in your research, please cite our technical report:

Excerpt shown — open the source for the full document.

Notability

notability 7.0/10

Notable model release with good traction.