PaddlePaddle/PassNet
Stars: 13
Forks: 15
Open issues: 4
Created: 2025-07-31T12:10:28Z
Pushed: 2026-06-01T02:14:08Z
Default branch: develop
Fork: no
Archived: no
README:
PassNet
PassNet is an AI system for compiler optimization that leverages LLM-driven agents to automatically generate high-performance GPU kernels through compiler pass mechanisms for computation graph optimization. PassNet includes a complete optimization toolchain, the PassBench evaluation benchmark, and the PassAgent agent evaluation framework.
English | [中文](README_cn.md)
Links
- Paper: arXiv:2605.29357
- Dataset: PassNet on HuggingFace
- Leaderboard: PassBench Leaderboard
Table of Contents
- [Project Structure](#project-structure)
- [Architecture Overview](#architecture-overview)
- [Core Components](#core-components)
- [DataSet](#dataset)
- [Quick Start](#quick-start)
- [PassBench Evaluation Pipeline](#passbench-evaluation-pipeline)
- [PassAgent Evaluation](#passagent-evaluation)
- [License](#license)
Project Structure
PassNet/
├── pass_bench/          # PassBench compiler evaluation framework: kernel compilation, correctness verification, performance benchmarking
├── pass_agent/          # PassAgent evaluation framework
├── samples/             # PassBench sample data
├── sample_lists/        # PassBench sample list files (eval/train splits)
├── entry_scripts/       # Evaluation entry scripts
├── graphs/              # Subgraph data
├── graph_lists/         # Subgraph lists and grouping info
├── test/                # Unit tests
├── Dockerfile.nvidia    # Docker image definition
└── requirements.txt     # Python dependencies
Architecture Overview
- PassAgent (LLM-driven Pass Generation): Multi-step Iterative Solving · k-attempts · R2E-Gym Framework
  - reads data from DataSet
  - sends each generated pass to PassBench
  - receives feedback from PassBench
- DataSet
  - graphs/: sole_op (5,939), fusible (22,870), typical (25,151)
  - samples/: sole_op (1,029), fusible (4,676), typical (4,278)
  - sample_lists/: train/, eval/
- PassBench
  - 1. Execution & Eval: Eager Execution, pass_mgr Execution
  - 2. Result Checking: Correctness & Speedup
  - 3. Score Aggregation: ES(t) & AS Met
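The PassAgent/PassBench feedback loop shown above can be sketched roughly as follows. This is only an illustration of one plausible reading of "multi-step iterative solving" with "k-attempts": the names `EvalResult`, `solve_with_attempts`, `generate_pass`, and `evaluate_pass`, and the choice to restart feedback for each attempt, are assumptions and are not taken from the PassAgent code.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical summary of one PassBench evaluation."""
    correct: bool
    speedup: float
    feedback: str


def solve_with_attempts(generate_pass, evaluate_pass, k=3, max_steps=4):
    """Run k independent attempts; each attempt refines its pass for max_steps steps.

    generate_pass(feedback) -> candidate pass (any object)
    evaluate_pass(candidate) -> EvalResult
    Returns the (candidate, result) pair with the best speedup among
    correct candidates, or None if no candidate was correct.
    """
    best = None
    for _attempt in range(k):
        feedback = ""  # assumption: each attempt starts without prior feedback
        for _step in range(max_steps):
            candidate = generate_pass(feedback)
            result = evaluate_pass(candidate)
            if result.correct and (best is None or result.speedup > best[1].speedup):
                best = (candidate, result)
            feedback = result.feedback  # feed the evaluation back into the next step
    return best
```

In this sketch the agent and the evaluator are plain callables, so a stub evaluator can stand in for PassBench when testing the loop itself.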
Core Components
[PassBench](pass_bench/) — Compiler Evaluation Framework
Provides kernel compilation, correctness verification, and performance benchmarking. It serves as both a standalone evaluation tool and the backend evaluation framework invoked by PassAgent:
- Kernel Compilation: Executes pass matching and replacement via the `pass_mgr` compiler method
- Correctness Verification: Validates numerical correctness of optimized kernels against dtype-specific tolerance thresholds (float32 / float16 / bfloat16)
- Performance Benchmarking: Measures speedup over 100 trials and outputs `aggregated_score.json`
- Score Aggregation: `aggregate_es_scores.py` computes ES(t) scores across all graphs in a sample
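The verification and benchmarking steps above can be sketched roughly as follows. This is a minimal standard-library illustration, not PassBench's implementation: the tolerance values in `TOLERANCES`, the helper names `is_close_enough` and `measure_speedup`, and the use of the median as the timing statistic are all assumptions. Only the three dtypes and the 100-trial count come from the text.

```python
import math
import statistics
import time

# Hypothetical (rel_tol, abs_tol) per dtype; the real PassBench thresholds are not stated here.
TOLERANCES = {
    "float32": (1e-5, 1e-6),
    "float16": (1e-3, 1e-3),
    "bfloat16": (1e-2, 1e-2),
}


def is_close_enough(expected, actual, dtype):
    """Return True if both sequences have equal length and every pair of
    elements matches within the tolerance assigned to `dtype`."""
    rel_tol, abs_tol = TOLERANCES[dtype]
    if len(expected) != len(actual):
        return False
    return all(
        math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, a in zip(expected, actual)
    )


def measure_speedup(baseline_fn, optimized_fn, trials=100):
    """Time each callable `trials` times and return the ratio of median
    baseline time to median optimized time (values above 1 mean faster)."""
    def median_time(fn):
        samples = []
        for _ in range(trials):
            start = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - start)
        return statistics.median(samples)

    base = median_time(baseline_fn)
    opt = median_time(optimized_fn)
    # Guard against a zero reading on coarse clocks.
    return base / opt if opt > 0 else float("inf")
```

Looser tolerances for float16 and bfloat16 reflect their lower precision. On a GPU, kernel launches are asynchronous, so a real harness would also need to synchronize the device before reading the clock.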
[PassAgent](pass_agent/) — R2E-Gym Agent Evaluation Framework
Evaluates agent capabilities for compiler optimization using the R2E-Gym framework. See [pass_agent/README.md](pass_agent/README.md) for details.
DataSet
graphs — Raw Subgraph Data
Stores raw computation subgraphs extracted from deep learning models, serving as the source for PassBench samples:
- fusible_subgraphs/: A small set of example fusible subgraphs (1,456), containing computation graphs with multi-operator fusion opportunities
- hf_subgraphs/ (Legacy): Previous version subgraph data, containing sole op (1,410), fusible (4,167), and typical (6,157) categories
- hf_subgraphs_v2/: HuggingFace model subgraphs, organized into three categories:
  - `sole_op_subgraphs`: Single-operator subgraphs (5,939)
  - `fusible_subgraphs`: Fusible subgraphs (22,870)
  - `typical_subgraphs`: Typical subgraphs (25,151)
graph_lists — Subgraph Lists and Grouping
Stores subgraph path lists, UID groupings, and other information for sample filtering and group management:
Subgraph Path Lists (line format: `subgraph_UID\tsubgraph_relative_path`)

| File | Subgraphs | Description |
|------|-----------|-------------|
| [fusible_subgraphs.txt](graph_lists/fusible_subgraphs.txt) | 1,455 | Example fusible subgraph paths |
| [hf_sole_op_subgraphs.txt](graph_lists/hf_sole_op_subgraphs.txt) | 1,410 | Legacy sole op subgraph paths |
| [hf_fusible_subgraphs.txt](graph_lists/hf_fusible_subgraphs.txt) | 4,166 | Legacy fusible subgraph paths |
| [hf_typical_subgraphs.txt](graph_lists/hf_typical_subgraphs.txt) | 6,157 | Legacy typical subgraph paths |
| [hf_sole_op_subgraphs_v2.txt](graph_lists/hf_sole_op_subgraphs_v2.txt) | 5,939 | v2 sole op subgraph paths |
| [hf_fusible_subgraphs_v2.txt](graph_lists/hf_fusible_subgraphs_v2.txt) | 22,870 | v2 fusible subgraph paths |
| [hf_typical_subgraphs_v2.txt](graph_lists/hf_typical_subgraphs_v2.txt) | 25,151 | v2 typical subgraph paths |
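Given the tab-separated line format described above, a list file could be read along these lines. This is a small sketch assuming one `UID<TAB>path` pair per line and unique UIDs; the function name `parse_subgraph_list` is hypothetical and the repository may ship its own loader.

```python
from pathlib import PurePosixPath


def parse_subgraph_list(text):
    """Parse lines of the form 'subgraph_UID<TAB>subgraph_relative_path'.

    Returns a dict mapping each UID to its relative path. Blank lines are
    skipped; a line without both fields raises ValueError.
    """
    entries = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        uid, sep, rel_path = line.partition("\t")
        if not sep or not uid.strip() or not rel_path.strip():
            raise ValueError(f"line {line_no}: expected 'UID<TAB>path', got {line!r}")
        entries[uid.strip()] = PurePosixPath(rel_path.strip())
    return entries
```

To load one of the files in the table, pass its contents as a string, for example `parse_subgraph_list(open("graph_lists/fusible_subgraphs.txt").read())`.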
samples — PassBench Evaluation Samples
Evaluation samples generated from graphs/, each serving as an independently executable evaluation unit:
- fusible_subgraphs/: A small set of example samples from TIMM models' fusible subgraphs, organized by `model_name/subgraph_index`
- hf_subgraphs/ (Legacy): Previous version subgraph samples, containing sole op (590), fusible (2,489), and typical (3,382) categories
-…