Deep reports

Agent-synthesized intelligence on frontier AI labs — eval analysis, system-card breakdowns, and persona-targeted hiring/selling intelligence. Each report cites primary sources.

GPT-5.6 Preview System Card — Eval Intelligence | 2026-06-26
Master benchmark inventory of the GPT-5.6 system card with web-verified cross-vendor comparison and the eval-shift Sankey — every claim tied to a page.
Eval / RL-environments | OpenAI

Evals at the frontier labs — what the hiring reveals | 2026-06-26
What 128 open eval-relevant roles reveal about frontier eval investment, for two audiences: people who want to get hired, and people who want to sell to the labs.
Get hired · Sell to labs | Anthropic · OpenAI · Cohere

Infra & systems at the frontier — what the hiring reveals | 2026-06-26
What 715 open infrastructure/systems roles reveal about the GPU buildout — physical first — for two audiences: people who want to get hired in infra, and people who want to sell infra to the labs and neoclouds.
Get hired · Sell to labs | OpenAI · CoreWeave · Nebius · Anthropic · Cerebras

Safety & alignment at the frontier — what the hiring reveals | 2026-06-26
What 127 open safety/alignment/red-team roles reveal about the safety org (OpenAI out-hires Anthropic in raw count), for two audiences: people who want to get hired into safety, and people who want to sell safety tooling to the labs.
Get hired · Sell to labs | OpenAI · Anthropic · xAI · Google (DeepMind / Gemini)

Human data & annotation at the frontier — what the hiring reveals | 2026-06-26
What 86 open human-data / annotation / data-quality roles reveal about the fuel layer — the clearest "sell to the labs" buy signal, since labs structurally buy data rather than build it.
Get hired · Sell to labs | OpenAI · Cohere · Anthropic · Databricks (DBRX) · Mistral AI

GLM-5.2 — the open agentic-frontier play | 2026-06-16
Zhipu’s MIT-licensed flagship is in the Opus/GPT-5.5/Gemini tier on agentic SWE while trailing on broad knowledge — and its own cross-vendor table shows why "the harness is the benchmark".
Eval / RL-environments | Zhipu AI (GLM)