Human data & annotation at the frontier — what the hiring reveals

Synthesized by the onlylabs Content Studio agent (Claude Code) · web-verified

Chart: Open roles by lab. OpenAI 28 · Cohere 12 · Anthropic 11 · Databricks 5 · Mistral 5 · Meta AI 4

A data person's read of 86 open human-data / annotation / data-quality roles across frontier labs (onlylabs, June 2026), for two audiences: people who want to get hired into data work, and people who want to sell data services to the labs. This is the fuel layer — the labeled, preference, and domain-expert data that post-training and evals run on.


0. The macro read

86 open data roles: OpenAI 28 ≫ Cohere 12 · Anthropic 11 ≫ Databricks 5 · Mistral 5 · Meta 4 (these six labs account for 65 of the 86). Two reads:

1. Human data is now a named platform, not a vendor afterthought. OpenAI staffs a Program Manager, Human Data; Anthropic runs a Human Data Platform/Operations/Interface group and a Data Scientist, Supply (i.e. managing the data supply chain). The labs are building the org to buy and manage human data at scale.

2. Cohere is the annotation shop: its 12 roles are Data Annotation Specialists across Safety / Data Science / SWE. This is the most accessible on-ramp into frontier data work.
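To put the headline counts in proportion, the short sketch below converts them into shares of the 86-role total and computes how many roles fall outside the six named labs. The counts are copied from the text above; the function and variable names are illustrative only, and nothing here comes from the onlylabs pipeline itself.

```python
# Counts as quoted in the text above (onlylabs, June 2026).
TOTAL_ROLES = 86
roles_by_lab = {
    "OpenAI": 28,
    "Cohere": 12,
    "Anthropic": 11,
    "Databricks": 5,
    "Mistral": 5,
    "Meta": 4,
}


def summarize(counts, total):
    """Return (percent share per lab, number of roles not at a named lab)."""
    shares = {lab: round(100 * n / total, 1) for lab, n in counts.items()}
    unlisted = total - sum(counts.values())
    return shares, unlisted


shares, unlisted = summarize(roles_by_lab, TOTAL_ROLES)
for lab, pct in shares.items():
    print(f"{lab:<11}{roles_by_lab[lab]:>3}  {pct:>5.1f}%")
print(f"Named labs: {TOTAL_ROLES - unlisted} of {TOTAL_ROLES}; other: {unlisted}")
```

On these numbers OpenAI alone holds roughly a third of the openings, and 21 roles sit outside the six labs named in the headline.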


1. If you want to get hired (as a data person)

Position: if you've run annotation operations, vendor QA, or RLHF/preference-data pipelines, the platform roles are your target — the labs are professionalizing exactly that function.


2. If you want to sell to frontier labs

This is the clearest "buy" signal of any persona. Unlike training systems or evals, which are built in-house, human data is something the labs structurally buy: they are staffing programs to manage suppliers, not to replace them.

Buy-signal ranking: every top lab here is a buyer; OpenAI and Anthropic are professionalizing the supply function, which means they're choosing vendors and tooling now.


3. The connections


4. What the JDs actually say (deep dive)

Read the actual JDs for the top human-data teams.

What it means: this is the clearest sell-to of any persona, because every top lab staffs a buy-and-manage data org. Vendors should lead with (1) bespoke campaign management, (2) synthetic-data generation with quality guarantees, and (3) the QA/provenance layer. For job seekers, the platform and supply roles are higher-leverage than line annotation.


Method: 86 open human-data / annotation / data-quality roles from onlylabs (kind=job_opened, matched against a data lexicon, with data-infra and data-platform engineering roles filtered out as noise). §4 reads the actual JDs. Counts as of 2026-06-26; linked roles are live.
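The Method note describes a three-step filter: keep job-opened events, match them against a data lexicon, then drop data-infra and data-platform engineering roles. A minimal sketch of that logic follows, under loud assumptions: the source does not publish its lexicon, its exclusion terms, or its record schema, so `DATA_LEXICON`, `INFRA_NOISE`, the `Event` fields, and every sample record (including "ExampleLab") are hypothetical. Only the `job_opened` kind is taken from the note.

```python
from dataclasses import dataclass

# Hypothetical term lists; the real lexicon and noise filter are not published.
DATA_LEXICON = ("human data", "annotation", "data quality", "labeling")
INFRA_NOISE = ("data infrastructure", "data platform engineer", "data engineer")


@dataclass
class Event:
    kind: str   # "job_opened" is the only value named in the Method note
    lab: str
    title: str


def is_human_data_role(event: Event) -> bool:
    """Keep opened jobs whose title hits the lexicon and is not infra noise."""
    if event.kind != "job_opened":
        return False
    title = event.title.lower()
    if any(term in title for term in INFRA_NOISE):
        return False
    return any(term in title for term in DATA_LEXICON)


# Illustrative records only, not real onlylabs data.
events = [
    Event("job_opened", "OpenAI", "Program Manager, Human Data"),
    Event("job_opened", "Cohere", "Data Annotation Specialist, Safety"),
    Event("job_opened", "ExampleLab", "Data Platform Engineer, Annotation Tools"),
    Event("job_closed", "ExampleLab", "Data Quality Lead"),
]
kept = [e.title for e in events if is_human_data_role(e)]
print(kept)
```

The exclusion check runs before the lexicon match so that an infra title that happens to mention annotation is still dropped. A real pipeline would likely match on the full job description rather than the title alone.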