Safety & alignment at the frontier — what the hiring reveals
A safety person's read of 127 open safety/alignment/red-team roles across frontier labs (onlylabs, June 2026), for two audiences: people who want to get hired into safety, and people who want to sell safety tooling to the labs. Companion to the evals report, which covers the eval-harness slice; this is the wider safety org.
0. The macro read
127 open safety roles, and the surprise is the order: OpenAI 55 > Anthropic 42 ≫ xAI 9 > Google DeepMind 5 (these four account for 111 of the 127; the remaining 16 are not broken out here). The safety-branded lab (Anthropic) is currently out-hired in raw safety headcount by OpenAI, though the shapes differ. OpenAI's safety hiring spreads across Alignment (Oversight, Training, Misalignment, Science), Preparedness (threat modeling, biosafety/cyber red-team, recursive self-improvement), and Trust & Safety. Anthropic concentrates in a deep Safeguards org plus Interpretability, Alignment Science, and a Frontier Red Team.
The throughline: safety is now an engineering-and-ops discipline, not just research — Safeguards tooling, red-team infrastructure, monitoring, and dangerous-capability evals dominate over pure theory.
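To make the headline counts easier to compare, here is a minimal sketch that converts them into shares of the 127 total. The four lab counts and the total are the report's; the helper name `share_table` and the "Other / unattributed" row are illustrative assumptions, and that row only reflects that the four named counts sum to 111.

```python
# Counts quoted in the macro read (total and per-lab figures are from the report).
TOTAL_ROLES = 127
lab_counts = {"OpenAI": 55, "Anthropic": 42, "xAI": 9, "Google DeepMind": 5}


def share_table(counts, total):
    """Return {name: percent of total}, plus a remainder row.

    The remainder row is an assumption for illustration: the report does not
    say where the roles outside the four named labs sit.
    """
    named = sum(counts.values())
    rows = {name: round(100 * n / total, 1) for name, n in counts.items()}
    rows["Other / unattributed"] = round(100 * (total - named) / total, 1)
    return rows


shares = share_table(lab_counts, TOTAL_ROLES)
for name, pct in shares.items():
    print(f"{name:22s} {pct:5.1f}%")
```

On these figures OpenAI and Anthropic together hold roughly three quarters of the open roles, which is why the sections that follow focus on those two.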
1. If you want to get hired (as a safety person)
- OpenAI (55) — three doors: alignment (Threat Modeler, Preparedness and the Alignment Oversight/Training/Misalignment roles), preparedness red-team (biosafety, cyber, recursive self-improvement — mapped to the Preparedness Framework), and interpretability. If you can red-team a dangerous capability or build an oversight monitor, you fit.
- Anthropic (42) — the Safeguards Labs org (research + eng + data + tooling), Interpretability (research scientists + engineers), Alignment Science, and the Frontier Red Team (Cyber). Anthropic wants safety people who ship infrastructure, not just write papers — see the eval companion for the "runs on every agent change" bar.
- Google DeepMind (5) / xAI (9) are smaller but real: DeepMind's openings include the "Research Scientist, Multimodal Alignment, Safety" role read in §4; xAI skews toward alignment and monitoring.
Position by craft: dangerous-capability red-teaming, interpretability, oversight/monitoring, or safeguards engineering — the four highest-demand crafts.
2. If you want to sell to frontier labs
Labs build the safety science in-house (don't sell them alignment research), but they buy the operational layer:
- Red-team capacity & dangerous-capability eval data — OpenAI's biosafety/cyber red-team + Preparedness hiring = demand for bio/cyber eval datasets and automated red-teaming. Pitch: red-team-as-a-service, dangerous-capability eval suites.
- Monitoring / oversight / safeguards infrastructure — Anthropic's Safeguards tooling + data-infra roles = demand for detection pipelines, monitoring throughput, and the data behind them. Pitch: monitoring infra, abuse-detection data, harness throughput. (They build the auditing science — Petri, SHADE-Arena — so sell the throughput and data, not the method.)
- Eval/safeguards data & annotation — see the human-data report and the evals report; safety is a heavy buyer of red-team and policy-labeled data.
Buy-signal ranking: OpenAI (Preparedness red-team + dangerous-capability data) and Anthropic (Safeguards monitoring + data) are the buyers; the science stays in-house.
3. The connections
- OpenAI's Preparedness red-team roles + the Preparedness Framework ⇒ they're operationalizing dangerous-capability gating — every red-team role maps to a framework category; sell into the categories, not the model.
- Anthropic's Safeguards + Interpretability + Frontier Red Team ⇒ a vertically-integrated safety stack (detect → interpret → red-team) built as infrastructure — the buy surface is throughput and data, not research.
4. What the JDs actually say (deep dive)
This section reads the actual JDs for the top safety teams.
- OpenAI safety ships into production. The Alignment Oversight JD: "improving control, accountability, and alignment as AI systems become more capable and agentic… combine longer-horizon research with hands-on deployment… building oversight systems used in practice today — code review and action monitoring." Safety here is a deployed monitoring function, not only research.
- DeepMind's lens is sociotechnical. The Research Scientist, Multimodal Alignment, Safety JD (Frontier AI unit): "interdisciplinary sociotechnical modeling… modeling the interactions between AI and society." A different safety craft than OpenAI/Anthropic's systems focus.
- Anthropic = a vertically-integrated safety stack — Safeguards + Interpretability + Frontier Red Team, built as infrastructure (see the eval report's §6: "runs on every agent change").
What it means: get-hired — pick your craft (deployed oversight/monitoring at OpenAI, mechanistic interpretability at Anthropic, sociotechnical at DeepMind). Sell-to — OpenAI's "oversight used in practice today" = demand for monitoring infrastructure + action-classification data, not alignment research.
Method: 127 open safety/alignment/red-team roles from onlylabs (kind=job_opened, matched against a safety lexicon, with comms, legal, and T&S-ops roles filtered out as noise). §4 reads the actual JDs. This report overlaps the evals report on the eval-harness slice, so read both. Counts as of 2026-06-26; linked roles are live.
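The filtering described in the Method note can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the report's actual pipeline: only `kind=job_opened` comes from the note, while the `Event` record, the lexicon terms, the noise terms, and the sample rows (with placeholder labs "LabA" and "LabB") are all hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical term lists: the report does not publish its lexicon or noise list.
SAFETY_LEXICON = ("safety", "alignment", "red team", "red-team", "safeguards",
                  "interpretability", "preparedness", "oversight")
NOISE_TERMS = ("communications", "counsel", "legal", "trust & safety operations")


@dataclass
class Event:
    """Hypothetical record shape for one feed event."""
    kind: str
    lab: str
    title: str


def _matches(text, terms):
    """True if any term appears in text as a whole word or phrase."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(t) + r"\b", lowered) for t in terms)


def is_safety_role(event):
    """Keep opened jobs that hit the safety lexicon and miss the noise list."""
    if event.kind != "job_opened":
        return False
    if not _matches(event.title, SAFETY_LEXICON):
        return False
    return not _matches(event.title, NOISE_TERMS)


def count_by_lab(events):
    """Count surviving roles per lab, largest first."""
    counts = Counter(e.lab for e in events if is_safety_role(e))
    return dict(counts.most_common())


# Made-up sample rows, for illustration only.
SAMPLE = [
    Event("job_opened", "LabA", "Research Engineer, Alignment Oversight"),
    Event("job_opened", "LabA", "Safety Communications Lead"),   # noise term
    Event("job_closed", "LabA", "Interpretability Researcher"),  # wrong kind
    Event("job_opened", "LabB", "Frontier Red Team Engineer"),
    Event("job_opened", "LabB", "Account Executive"),            # no lexicon hit
]

print(count_by_lab(SAMPLE))
```

Matching on titles alone is a simplification chosen for brevity; a real pass would likely also scan the JD body, which changes both recall and the amount of noise to remove.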