What does this repo signal mean?

Anthropic published the Python repository anthropics/defending-code-reference-harness. A repo signal like this surfaces tooling, eval, infrastructure, or model-adjacent work that may not yet have appeared in a launch post. High-signal details: repo anthropics/defending-code-reference-harness · language Python · high star count and HN points for Anthropic code security work. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Anthropic Repo: anthropics/defending-code-reference-harness

Captured source

GitHub: github.com/anthropics/defending-code-reference-harness

anthropics/defending-code-reference-harness repository metadata

published May 22, 2026 · seen Jun 5 · captured Jun 11 · HTTP 200 · method plain

anthropics/defending-code-reference-harness

Description: Skills for threat modeling, scanning, triage, patching, plus an autonomous scanning harness you can /customize

Language: Python

License: NOASSERTION

Stars: 5683

Forks: 407

Open issues: 6

Created: 2026-05-22T16:00:56Z

Pushed: 2026-06-02T20:53:33Z

Default branch: main

Fork: no

Archived: no

README:

Defending Code Reference Harness

A reference implementation for autonomous vulnerability discovery and remediation with Claude, based on our learnings from partnering with security teams at several organizations since launching Claude Mythos Preview. For a write-up of these learnings along with best practices, see the accompanying blog post (also available in [blog-post.md](docs/blog-post.md)). For a lightweight SDK-only walkthrough of the same recon → find → triage → report → patch loop, see the companion cookbook.

This repo is not maintained and is not accepting contributions.

> 🔒 Want a managed option? Anthropic offers Claude Security, a hosted product
> that finds and fixes vulnerabilities in your source code across multiple
> projects. Claude Security scans your repository for vulnerabilities,
> applies a multi-stage verification pipeline to reduce false positives, and
> lets you manage findings through their lifecycle: triage, fix validation,
> and rapid fix generation.
>
> This repository is an open-source reference implementation based on general
> best practices for finding vulnerabilities using Claude. You can use it to
> build your own vulnerability-finding pipeline and customize the logic, and it
> works with whatever access you have to Claude APIs (including
> Bedrock, Vertex, or Azure).

Claude Code skills: /quickstart, /threat-model, /vuln-scan, /triage, /patch, /customize: interactive scoping, scanning, triage, and patching. Open this repo in Claude Code and run /quickstart to get oriented.

`harness/`: the autonomous reference pipeline (recon → find → verify → report → patch), configured for finding C/C++ memory vulnerabilities using Docker and ASAN. This harness is a reference, not a product. The general shape, prompts, and sandboxing are reusable, but the harness will not work on every codebase out of the box. Run /customize to port it to your language, detector, or vuln class.
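As a rough illustration of the pipeline shape described above, here is a minimal Python sketch that runs five stages in the README's order and passes accumulated results from one stage to the next. Only the stage names come from the README; `PipelineState`, `run_pipeline`, and the handler signature are hypothetical names invented for this sketch and are not taken from the harness code.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage names and their order come from the README; everything else is hypothetical.
STAGES = ("recon", "find", "verify", "report", "patch")


@dataclass
class PipelineState:
    """Accumulates what each stage produced for one target."""
    target: str
    artifacts: dict = field(default_factory=dict)
    completed: list = field(default_factory=list)


# A stage handler reads the state so far and returns its own result (or None).
Stage = Callable[[PipelineState], object]


def run_pipeline(target: str, handlers: dict) -> PipelineState:
    """Run the stages in order, stopping early if a stage returns None."""
    missing = [name for name in STAGES if name not in handlers]
    if missing:
        raise ValueError(f"no handler for stages: {missing}")
    state = PipelineState(target=target)
    for name in STAGES:
        result = handlers[name](state)
        if result is None:  # e.g. nothing found, so nothing left to verify
            break
        state.artifacts[name] = result
        state.completed.append(name)
    return state
```

Keeping the stage order in one tuple and the handlers in a plain dict is one way to make a single stage swappable (for example, a different detector in `find`) without touching the loop.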

> ⚠️ Security: /quickstart, /threat-model, /vuln-scan, and /triage
> only read and write files. Running /patch on static findings (TRIAGE.json
> or VULN-FINDINGS.json) is likewise read- and write-only. /customize edits
> the harness code and runs validation commands. Any of these skills are safe to
> run unsandboxed, as long as you review and approve each tool use in Claude Code.
>
> The autonomous reference pipeline (including /patch on pipeline results)
> executes target code, so it refuses to run outside of a gVisor sandbox
> unless explicitly overridden. To get set up, run scripts/setup_sandbox.sh once,
> then invoke the pipeline via bin/vp-sandboxed. See [docs/security.md](docs/security.md)
> and [docs/agent-sandbox.md](docs/agent-sandbox.md) for more details.
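The "refuse unless sandboxed or explicitly overridden" behavior in the security note can be sketched as a small guard that runs before any target code does. This is an illustration of the pattern only: the excerpt does not show how the harness detects gVisor, and the environment variable names `HARNESS_IN_SANDBOX` and `HARNESS_ALLOW_UNSANDBOXED` are hypothetical placeholders, not real harness settings.

```python
import os

# Both variable names are hypothetical, chosen for this sketch only.
SANDBOX_MARKER = "HARNESS_IN_SANDBOX"
OVERRIDE_FLAG = "HARNESS_ALLOW_UNSANDBOXED"


class SandboxRequired(RuntimeError):
    """Raised when target code would otherwise run without isolation."""


def require_sandbox(env=None) -> str:
    """Return 'sandboxed' or 'overridden'; raise if neither condition holds."""
    env = os.environ if env is None else env
    if env.get(SANDBOX_MARKER) == "1":
        return "sandboxed"
    if env.get(OVERRIDE_FLAG) == "1":
        # The operator has explicitly accepted the risk of running unsandboxed.
        return "overridden"
    raise SandboxRequired(
        f"refusing to execute target code: set {SANDBOX_MARKER}=1 inside the "
        f"sandbox, or {OVERRIDE_FLAG}=1 to override"
    )
```

Failing closed by default means a forgotten setup step produces an error rather than an unsandboxed run.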

Getting Started

git clone https://github.com/anthropics/defending-code-reference-harness
cd defending-code-reference-harness
claude

# 30-sec intro + guided first run on the canary target
> /quickstart

> /quickstart how do I port the pipeline to Java?
> /quickstart how do I triage all these bugs?

Ramp Up

The most successful security teams we've partnered with are those that have gotten hands-on the fastest. Though it's tempting to spend months designing the perfect pipeline, we recommend starting small on Day 1 and building from there as learnings come. The steps below follow that pattern and set an ambitious (but reasonable) pace based on what we've seen.

| Step | When | Goal |
|-------------------------------------------------------------------------------------|--------------|--------------------------------------------------------------|
| [Step 1](#step-1-day-1-build-a-threat-model-and-run-your-first-static-scan--triage) | Day 1 | Build a threat model and run your first static scan + triage |
| [Step 2](#step-2-day-2-run-the-reference-pipeline-on-a-cc-library) | Day 2 | Run the reference pipeline on a C/C++ library |
| [Step 3](#step-3-days-3-5-customize-the-pipeline-for-your-target) | Days 3-5 | Customize the pipeline for your target |
| [Step 4](#step-4-week-2-start-autonomous-scanning-triage-and-patching) | Week 2 | Start autonomous scanning, triage, and patching |

Step 1 (Day 1): Build a threat model and run your first static scan + triage

Day 1 is focused on seeing the whole loop end-to-end. Using only the interactive skills, you'll build a threat model, run a static scan scoped by it, triage what comes back, and draft candidate fixes. You'll finish the day with a threat model, a ranked list of static findings, and candidate patches.

The relevant skills only read and write files in your repo. As long as you run Claude Code interactively and approve each tool use, no sandbox is needed.

# Pin every subagent to the model you want
export CLAUDE_CODE_SUBAGENT_MODEL=
claude

# 0. intro + guided first run
> /quickstart

# 1. Build a threat model (aim before you shoot)
> /threat-model bootstrap targets/canary

# 2. Run a static scan, scoped by that threat model
> /vuln-scan targets/canary

# 3. Verify, dedupe, and rank what came back...
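To make the "dedupe and rank" part of step 3 concrete, here is a minimal Python sketch of one way to collapse duplicate findings and order the rest. The excerpt does not show the schema of TRIAGE.json or VULN-FINDINGS.json, so the `Finding` fields, the severity scale, and the ranking rule below are all assumptions made for illustration, not the /triage skill's actual logic.

```python
from dataclasses import dataclass

# Hypothetical severity scale; the real triage output may use different labels.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record, not the harness's real schema."""
    file: str
    line: int
    vuln_class: str
    severity: str
    confidence: float  # 0.0 to 1.0


def dedupe_and_rank(findings):
    """Keep the most confident finding per location, then sort by severity."""
    best = {}
    for f in findings:
        key = (f.file, f.line, f.vuln_class)  # same spot + same class = duplicate
        if key not in best or f.confidence > best[key].confidence:
            best[key] = f
    return sorted(
        best.values(),
        key=lambda f: (
            SEVERITY_ORDER.get(f.severity, len(SEVERITY_ORDER)),  # unknown last
            -f.confidence,
            f.file,
            f.line,
        ),
    )
```

Sorting on severity first and confidence second is only one reasonable policy; a team may prefer to rank by reachability from the threat model instead.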

Excerpt shown — open the source for the full document.

Notability

notability 8.0/10

High stars and HN points for Anthropic code security work
