Repo · Groq · published Oct 15, 2025

groq/openbench-cyber

Python

Description: Cybersecurity evals plugin for openbench

Language: Python

Stars: 1

Forks: 0

Open issues: 8

Created: 2025-10-15T22:25:49Z

Pushed: 2026-02-10T23:33:21Z

Default branch: main

Fork: no

Archived: no

README:

openbench-cyber

Cybersecurity evaluation plugin for openbench.

This package moves all cybersecurity-heavy benchmarks (CTI-Bench + CyBench) into a separate optional dependency so that the core openbench distribution stays lean while still supporting advanced security workloads.

Installation

Install directly from Git or rely on the optional extra exposed by openbench:

uv pip install "openbench-cyber @ git+https://github.com/groq/openbench-cyber.git@main"

# or pull it in automatically via the optional extra
uv pip install "openbench[cyber]"

After installation, the new benchmarks appear automatically in the output of bench list, because they are registered through openbench's entry-point-based plugin system.

Included Benchmarks

  • cti_bench_ate – MITRE ATT&CK technique extraction
  • cti_bench_mcq – CTI multiple-choice knowledge assessments
  • cti_bench_rcm – CVE→CWE vulnerability classification
  • cti_bench_vsp – CVSS severity regression
  • cybench – Agentic CTF-style challenges powered by inspect-cyber

Run them exactly like any other benchmark:

bench eval cti_bench_vsp --model groq/llama-3.3-70b-versatile
bench eval cybench --env CYBENCH_ACKNOWLEDGE_RISKS=1
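To run the whole suite in one go, the command lines above can be assembled programmatically. The following is a small sketch that only builds and prints the commands; it uses just the --model and --env flags shown above. The helper name build_eval_command is made up for this example, and combining both flags in a single invocation is an assumption, since the README shows them separately.

```python
import shlex
from typing import Optional

# Benchmark names taken from the "Included Benchmarks" list.
CTI_BENCHMARKS = ["cti_bench_ate", "cti_bench_mcq", "cti_bench_rcm", "cti_bench_vsp"]


def build_eval_command(
    benchmark: str,
    model: Optional[str] = None,
    env: Optional[str] = None,
) -> list[str]:
    """Build an argv list for `bench eval` (hypothetical helper).

    `env` is a single KEY=VALUE string passed through --env.
    """
    argv = ["bench", "eval", benchmark]
    if model is not None:
        argv += ["--model", model]
    if env is not None:
        argv += ["--env", env]
    return argv


model = "groq/llama-3.3-70b-versatile"
commands = [build_eval_command(name, model=model) for name in CTI_BENCHMARKS]
# cybench requires the risk acknowledgement shown in the README.
commands.append(build_eval_command("cybench", env="CYBENCH_ACKNOWLEDGE_RISKS=1"))

for argv in commands:
    # Print instead of executing; pass argv to subprocess.run to launch a run.
    print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before launching anything, which matters for cybench because it executes agentic CTF tasks.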

Development

uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
pre-commit install

The repository mirrors openbench's release automation (Release Please + PyPI publish) so that security benchmarks can ship independently from the main project.
