{"schema_version":"onlylabs.public_analysis.v1","url":"https://onlylabs.fyi/analysis/baseten","json_url":"https://onlylabs.fyi/analysis/baseten/analysis.json","evidence_json_url":"https://onlylabs.fyi/analysis/baseten/evidence.json","generated_at":"2026-06-27T22:30:16.541Z","analysis":{"org_slug":"baseten","url":"https://onlylabs.fyi/analysis/baseten","json_url":"https://onlylabs.fyi/analysis/baseten/analysis.json","evidence_json_url":"https://onlylabs.fyi/analysis/baseten/evidence.json","dossier_url":"https://onlylabs.fyi/labs/baseten","org":{"slug":"baseten","name":"Baseten","category":"neocloud","category_label":"Neocloud","homepage_url":"https://www.baseten.co"},"title":"Baseten analysis","summary":"Baseten is executing a decisive pivot from inference-specialist to full-stack AI infrastructure platform. The evidence shows a company simultaneously scaling toward enterprise across four fronts: (1) adding a training product alongside core inference, (2) building a research organization producing original work on speculative decoding, timestep distillation, and legal-agent post-training, (3) raising $1.5B Series F…","markdown":"## Thesis\n\nBaseten is executing a decisive pivot from inference-specialist to full-stack AI infrastructure platform. 
The evidence shows a company simultaneously scaling toward enterprise across four fronts: (1) adding a **training product** alongside core inference, (2) building a **research organization** producing original work on speculative decoding, timestep distillation, and legal-agent post-training, (3) raising **$1.5B Series F** to triple headcount across engineering, research, operations, and GTM, and (4) investing heavily in **multi-cloud infrastructure** spanning 10+ cloud providers with self-hosted and hybrid deployment modes [P25](https://www.baseten.co/blog/introducing-model-apis-and-training/)[W4](https://www.verdict.co.uk/baseten-secures-1-5bn-series-f/)[W1](https://www.baseten.co/blog/faster-image-generation-timestep-distillation-flux2/)[W2](https://www.baseten.co/research/post-training-frontier-legal-agents-with-baseten-research/)[P28](https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/). The pattern of fork activity — UCX/UCXX for high-speed networking, DeepGEMM for kernel optimization, compact-rl for RL training infrastructure — confirms a deep buildout of the underlying systems required to make the training-to-inference lifecycle a single integrated offering [E5](https://github.com/basetenlabs/ucx)[E6](https://github.com/basetenlabs/ucxx)[E26](https://github.com/basetenlabs/DeepGEMM)[E57](https://github.com/basetenlabs/compact-rl).\n\n## Signal desks\n\n### Hiring\n\n- **Engineering leadership buildout**: Baseten is hiring an Engineering Manager for Cloud Platform, Internal Platform, and Runtime Fabric — three distinct infrastructure teams — plus a Technical Program Manager for Infrastructure, signaling a maturing org structure moving from flat IC teams to formal management layers 
[E47](https://jobs.ashbyhq.com/baseten/0870ed34-7365-4b9f-a50a-481783b8c266)[E48](https://jobs.ashbyhq.com/baseten/1e721b74-58b9-4b03-ac0f-4b1e5c32342e)[E49](https://jobs.ashbyhq.com/baseten/aae72bd3-6f75-4238-9741-95fec11facb9)[E45](https://jobs.ashbyhq.com/baseten/7d9d5a1f-3834-434e-b22f-4bd62317be3c).\n- **Capacity and compute expansion**: A Capacity Strategy & Operations Lead and a Software Engineer — Capacity role on the Internal Platform team indicate active infrastructure scaling to support the $1.5B Series F growth plan and new training workloads [E33](https://jobs.ashbyhq.com/baseten/918dde84-09c6-4ee7-a0b9-a3e3253ac4b0)[E35](https://jobs.ashbyhq.com/baseten/902a7ddb-c21f-4272-aaab-879680697986).\n- **GTM and commercialization**: Senior Analyst, Revenue Strategy & Operations; Partnerships Product Marketing Manager; Customer Marketing Manager; and Field Productivity & Enablement Lead all point to a serious enterprise GTM buildout [E15](https://jobs.ashbyhq.com/baseten/6d32aa11-ac93-4f90-8f62-bdeb79214ee5)[E18](https://jobs.ashbyhq.com/baseten/132295d6-eeb4-4655-9847-a7e9a586d273)[E36](https://jobs.ashbyhq.com/baseten/6111c4a1-4e29-4fb8-aca3-1c8c8e0cbfb1)[E13](https://jobs.ashbyhq.com/baseten/5a1c6228-3906-4ccc-9988-d9bd67383b9d).\n- **Product and developer experience**: Product Manager, Developer Experience and Senior Frontend Engineer on the Dedicated Inference team signal investment in the UI/surface layer of the platform [E42](https://jobs.ashbyhq.com/baseten/2d78fdcf-53e1-45d3-a047-2aefb5ad3153)[E16](https://jobs.ashbyhq.com/baseten/3622a1ee-50a9-4c45-af6e-aa12bd5de22f).\n- **Financial scaling**: Strategic Finance Associate / Sr. 
Associate on the G&A team reflects the organizational demands of a company absorbing $1.5B in new capital [E51](https://jobs.ashbyhq.com/baseten/71a011b6-0f17-4c0a-b0ba-38deffce1adb).\n- **Geographic concentration**: All cited open roles are based in San Francisco, indicating HQ-centric hiring even during rapid scaling [E13](https://jobs.ashbyhq.com/baseten/5a1c6228-3906-4ccc-9988-d9bd67383b9d)[E15](https://jobs.ashbyhq.com/baseten/6d32aa11-ac93-4f90-8f62-bdeb79214ee5)[E16](https://jobs.ashbyhq.com/baseten/3622a1ee-50a9-4c45-af6e-aa12bd5de22f)[E18](https://jobs.ashbyhq.com/baseten/132295d6-eeb4-4655-9847-a7e9a586d273)[E33](https://jobs.ashbyhq.com/baseten/918dde84-09c6-4ee7-a0b9-a3e3253ac4b0)[E35](https://jobs.ashbyhq.com/baseten/902a7ddb-c21f-4272-aaab-879680697986)[E36](https://jobs.ashbyhq.com/baseten/6111c4a1-4e29-4fb8-aca3-1c8c8e0cbfb1)[E42](https://jobs.ashbyhq.com/baseten/2d78fdcf-53e1-45d3-a047-2aefb5ad3153)[E45](https://jobs.ashbyhq.com/baseten/7d9d5a1f-3834-434e-b22f-4bd62317be3c)[E47](https://jobs.ashbyhq.com/baseten/0870ed34-7365-4b9f-a50a-481783b8c266)[E48](https://jobs.ashbyhq.com/baseten/1e721b74-58b9-4b03-ac0f-4b1e5c32342e)[E49](https://jobs.ashbyhq.com/baseten/aae72bd3-6f75-4238-9741-95fec11facb9)[E51](https://jobs.ashbyhq.com/baseten/71a011b6-0f17-4c0a-b0ba-38deffce1adb).\n- **Scale ambition**: The Series F coverage reports Baseten plans to triple headcount this year, with focus on engineering, research, operations, and go-to-market teams [W4](https://www.verdict.co.uk/baseten-secures-1-5bn-series-f/).\n\n### Forks\n\n- **Kernel and networking layer**: Forks of `openucx/ucx` and `rapidsai/ucxx` (Unified Communication X) signal work on high-speed GPU-to-GPU interconnects critical for multi-node inference and distributed training [E5](https://github.com/basetenlabs/ucx)[E6](https://github.com/basetenlabs/ucxx). 
The `deepseek-ai/DeepGEMM` fork points to custom kernel optimization for matrix multiplication workloads [E26](https://github.com/basetenlabs/DeepGEMM). `opencontainers/runc` fork suggests container runtime tuning for inference workloads [E56](https://github.com/basetenlabs/runc).\n- **RL and post-training infrastructure**: Fork of `PrimeIntellect-ai/prime-rl` as `compact-rl` (7 stars) and `modelscope/mcore-bridge` indicate active buildout of reinforcement learning and model-core bridging tooling for the training product line [E57](https://github.com/basetenlabs/compact-rl)[E59](https://github.com/basetenlabs/mcore-bridge). Fork of `thinking-machines-lab/tinker-cookbook` (1 star) suggests evaluation or recipe work for model fine-tuning [E55](https://github.com/basetenlabs/tinker-cookbook).\n- **Model optimization and serving**: `lightseekorg/TorchSpec` fork relates to speculative decoding research [E58](https://github.com/basetenlabs/TorchSpec). `ucb-bar/autocomp` fork suggests automated compilation work for hardware optimization [E60](https://github.com/basetenlabs/autocomp). `ideogram-oss/ideogram4` fork may connect to image generation model serving [E54](https://github.com/basetenlabs/ideogram4).\n- **CI/CD and DevTools**: Forks of `mikepenz/action-junit-report` and `moonrepo/run-report-action` (the latter released as v1) indicate internal CI/CD pipeline investment [P2](https://github.com/basetenlabs/action-junit-report)[P3](https://github.com/basetenlabs/run-report-action)[P1](https://github.com/basetenlabs/run-report-action/releases/tag/v1). 
The `GoogleContainerTools/container-debug-support` fork points to debugging tooling for containerized inference environments [E22](https://github.com/basetenlabs/container-debug-support).\n- **Agent tooling**: `alexzhang13/rlm` fork (RLM, Recursive Language Models) suggests exploration of recursive, agent-style inference pathways [E3](https://github.com/basetenlabs/rlm).\n\n### Releases\n\n- **Truss SDK rapid iteration**: The `basetenlabs/truss` repo released 11 versions from v0.18.7 through v0.18.17 within approximately two weeks (June 9–24), plus a release candidate (v0.18.16rc0), indicating active development on the core model packaging and deployment toolchain [E46](https://github.com/basetenlabs/truss/releases/tag/v0.18.7)[E43](https://github.com/basetenlabs/truss/releases/tag/v0.18.8)[E39](https://github.com/basetenlabs/truss/releases/tag/v0.18.9)[E34](https://github.com/basetenlabs/truss/releases/tag/v0.18.10)[E32](https://github.com/basetenlabs/truss/releases/tag/v0.18.11)[E31](https://github.com/basetenlabs/truss/releases/tag/v0.18.12)[E28](https://github.com/basetenlabs/truss/releases/tag/v0.18.13)[E25](https://github.com/basetenlabs/truss/releases/tag/v0.18.14)[E21](https://github.com/basetenlabs/truss/releases/tag/v0.18.15)[E17](https://github.com/basetenlabs/truss/releases/tag/v0.18.16)[E11](https://github.com/basetenlabs/truss/releases/tag/v0.18.17)[E19](https://github.com/basetenlabs/truss/releases/tag/v0.18.16rc0).\n- **Multi-language client expansion**: `baseten-go` v0.1.0 and `baseten-python` v0.9.0 show the platform building SDK support beyond the original Python tooling [E52](https://github.com/basetenlabs/baseten-go/releases/tag/v0.1.0)[E53](https://github.com/basetenlabs/baseten-python/releases/tag/v0.9.0). 
`baseten-cli` v0.2.0 indicates a dedicated CLI product separate from Truss [E12](https://github.com/basetenlabs/baseten-cli/releases/tag/v0.2.0).\n- **Ecosystem integration**: `langchain-baseten` libs/baseten/v0.2.1 updates the LangChain integration, maintaining compatibility with the broader agent/LLM ecosystem [E4](https://github.com/basetenlabs/langchain-baseten/releases/tag/libs/baseten/v0.2.1).\n- **CI tooling**: `basetenlabs/run-report-action` v1 (forked from moonrepo) provides CI run reporting for internal moon-based workflows [P1](https://github.com/basetenlabs/run-report-action/releases/tag/v1)[P3](https://github.com/basetenlabs/run-report-action).\n\n### Talking\n\n- **Strategic funding narrative**: Series F announcement ($1.5B) is the dominant external signal, framed around inference demand and plans to triple headcount [E2](https://www.baseten.co/blog/announcing-our-series-f/)[W4](https://www.verdict.co.uk/baseten-secures-1-5bn-series-f/). Earlier Series C coverage ($75M) from February 2025 established the inference-as-mission-critical thesis [P8](https://www.baseten.co/blog/announcing-baseten-75m-series-c/).\n- **Inference performance thought leadership**: Baseten publishes heavily on inference benchmarks — GH200 vs H100/H200 for Llama 3.3 70B, B200 GPU acceleration (5x throughput, 38% lower latency), day-zero Qwen 3 benchmarks with SGLang, and the \"world's fastest API for GLM 5.2\" [P5](https://www.baseten.co/blog/testing-llama-inference-performance-nvidia-gh200-lambda-cloud/)[P18](https://www.baseten.co/blog/accelerating-inference-nvidia-b200-gpus/)[P20](https://www.baseten.co/blog/day-zero-benchmarks-for-qwen-3-with-sglang-on-baseten/)[E1](https://www.baseten.co/blog/how-we-built-the-worlds-fastest-api-for-glm-52/). 
The embedding performance narrative is especially strong: BEI claims 2x throughput and 10% lower latency vs competitors, with a 12x client-side boost via the Rust-based Performance Client [P7](https://www.baseten.co/blog/introducing-baseten-embeddings-inference-bei/)[P11](https://www.baseten.co/blog/how-we-built-bei-high-throughput-embedding-inference/)[P26](https://www.baseten.co/blog/your-client-code-matters-10x-higher-embedding-throughput-with-python-and-rust/).\n- **Research output emerging**: Timestep distillation for FLUX.2 (2.5x faster image generation) and live draft model training for speculative decoding represent original applied research [W1](https://www.baseten.co/blog/faster-image-generation-timestep-distillation-flux2/)[E9](https://www.baseten.co/blog/live-draft-model-training-for-speculative-decoding/). Post-training frontier legal agents with Harvey on the LAB benchmark, using \"Baseten Research\" as a named entity, signals a formal research function [W2](https://www.baseten.co/research/post-training-frontier-legal-agents-with-baseten-research/).\n- **Product expansion**: Model APIs and Training launch (May 2025) is framed as covering the \"inference lifecycle,\" adding training infrastructure that supports fine-tuning and RLHF workloads [P25](https://www.baseten.co/blog/introducing-model-apis-and-training/). Baseten Chains GA (Feb 2025) targets compound AI systems with independent autoscaling per step [P4](https://www.baseten.co/blog/baseten-chains-for-production-compound-ai-systems/)[P6](https://www.baseten.co/resources/changelog/baseten-chains-is-now-ga-deploy-ultra-low-latency-compound-ai-at-scale/).\n- **Infrastructure depth**: Multi-cloud capacity management (MCM) blog explains how Baseten operates across 10+ cloud providers with Cloud/Self-hosted/Hybrid deployment modes and 99.99% uptime [P28](https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/). 
Forward Deployed Engineering (FDE) blog explains the customer-engineering model for accelerating adoption [P27](https://www.baseten.co/blog/forward-deployed-engineering/).\n- **Open-source advocacy**: Multiple posts guide users on switching from closed-source to open-source models, GPU selection guides (H100, H200, multi-node), and embedding model deployment — reinforcing the platform's positioning as the bridge from open weights to production [P12](https://www.baseten.co/blog/a-checklist-for-switching-to-open-source-ml-models/)[P13](https://www.baseten.co/blog/deployment-and-inference-for-open-source-text-embedding-models/)[P9](https://www.baseten.co/blog/how-multi-node-inference-works-llms-deepseek-r1/)[E27](https://www.baseten.co/blog/the-best-open-source-large-language-models-llms/).\n- **Ecosystem and partnerships**: Canopy Labs selects Baseten as preferred inference provider for Orpheus TTS (100K+ HuggingFace downloads), Chroma vector database integration, NVIDIA BioNeMo agent toolkit support, and partnerships with Retool, OpenRouter, and Poe for Model APIs launch [P22](https://www.baseten.co/blog/canopy-labs-selects-baseten-as-preferred-inference-provider-for-orpheus-tts-model/)[P15](https://www.baseten.co/blog/building-performant-embedding-workflows-with-chroma-and-baseten/)[E20](https://www.baseten.co/blog/nvidia-bionemo-agent-toolkit-on-baseten/)[P25](https://www.baseten.co/blog/introducing-model-apis-and-training/).\n- **Developer experience**: Changelog posts track iterative improvements — streaming logs from terminal, flexible instance types per deployment, OpenAI-compatible APIs, docs refresh, async log downloads, log export to OTLP endpoints, rolling deployments, container restart tracking, vLLM/SGLang metrics, and CLI log filtering/streaming 
[P16](https://www.baseten.co/resources/changelog/stream-baseten-logs-from-terminal/)[P17](https://www.baseten.co/resources/changelog/flexible-instance-types-per-model-deployment/)[P10](https://www.baseten.co/resources/changelog/baseten-is-fully-openai-compatible/)[P14](https://www.baseten.co/resources/changelog/docs-refresh/)[E8](https://www.baseten.co/resources/changelog/async-log-downloads/)[E50](https://www.baseten.co/resources/changelog/log-export-to-otlp-endpoints/)[E37](https://www.baseten.co/blog/rolling-deployments-zero-downtime-model-updates/)[E40](https://www.baseten.co/resources/changelog/container-restart-tracking/)[E44](https://www.baseten.co/resources/changelog/vllm-and-sglang-metrics/)[E24](https://www.baseten.co/resources/changelog/filter-and-stream-model-logs-from-the-cli/).\n- **Model catalog velocity**: GLM 5.2, Kimi K2.7 Coder, Mercury 2, and MAI-Thinking-1 are recent model additions or announcements, with deprecation notices for DeepSeek V3.1 and MiniMax M2.5 indicating active catalog curation [E29](https://www.baseten.co/resources/changelog/glm-52-available-on-baseten/)[E30](https://www.baseten.co/resources/changelog/kimi-k27-coder-on/)[E41](https://www.baseten.co/blog/mercury-2-is-now-available-on-baseten/)[W3](https://www.baseten.co/blog/mai-thinking-1/)[E10](https://www.baseten.co/resources/changelog/model-deprecation-deepseek-v31-minimax-m25/).\n- **Brand repositioning**: May 2025 rebrand frames Baseten as \"the building blocks of AI\" with the tagline \"inference is everything\" — a positioning shift toward being the foundational infrastructure layer for all AI [P21](https://www.baseten.co/resources/changelog/introducing-our-new-brand/)[P24](https://www.baseten.co/blog/introducing-our-new-brand/).\n\n## Shipping\n\n- **Model APIs and Training (May 2025)**: The most significant product expansion. 
Model APIs offer production-grade access to open-source models (launching with 4 models including DeepSeek V3/R1, Llama 4, Qwen 3), while Training adds infrastructure for fine-tuning and RLHF workloads. The launch is described as covering \"the inference lifecycle\" and enabling the path from closed-source API consumption to dedicated infrastructure [P25](https://www.baseten.co/blog/introducing-model-apis-and-training/).\n- **Baseten Chains GA (Feb 2025)**: SDK for compound AI systems enabling multi-model workflows with independent hardware and autoscaling per step, targeting ultra-low-latency production deployments [P4](https://www.baseten.co/blog/baseten-chains-for-production-compound-ai-systems/)[P6](https://www.baseten.co/resources/changelog/baseten-chains-is-now-ga-deploy-ultra-low-latency-compound-ai-at-scale/).\n- **Baseten Embeddings Inference (BEI) (Mar 2025)**: Purpose-built embedding/reranker/classifier runtime using TensorRT-LLM, claiming 2x higher throughput and 10% lower latency than prior solutions [P7](https://www.baseten.co/blog/introducing-baseten-embeddings-inference-bei/)[P11](https://www.baseten.co/blog/how-we-built-bei-high-throughput-embedding-inference/).\n- **Performance Client (Jun 2025)**: Open-source, OpenAI-compatible Python library with a Rust core, claiming up to 12x embedding throughput improvement via GIL-free parallel request execution [P26](https://www.baseten.co/blog/your-client-code-matters-10x-higher-embedding-throughput-with-python-and-rust/).\n- **NVIDIA B200 GPUs early access (Apr 2025)**: First inference platform to offer B200s, claiming 5x higher throughput, 50%+ lower cost per token, and 38% lower latency vs Hopper-generation hardware [P18](https://www.baseten.co/blog/accelerating-inference-nvidia-b200-gpus/)[P19](https://www.baseten.co/resources/changelog/early-access-announcing-b200s-on-baseten/).\n- **OpenAI-compatible APIs (Mar 2025)**: Full chat completions and completions API compatibility with the OpenAI SDK, enabling drop-in migration 
[P10](https://www.baseten.co/resources/changelog/baseten-is-fully-openai-compatible/).\n- **Multi-cloud capacity management (MCM)**: Unified control plane across 10+ cloud providers supporting Cloud, Self-hosted, and Hybrid deployment modes with 99.99% uptime and SOC 2 Type II, HIPAA, GDPR compliance [P28](https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/).\n- **Baseten Loops (May 2026)**: Training SDK for iterative, production-quality post-training workflows supporting long-sequence fine-tuning, RLHF, and async RL pipelines [W3](https://www.baseten.co/blog/mai-thinking-1/).\n- **Rolling deployments**: Zero-downtime model updates for production inference [E37](https://www.baseten.co/blog/rolling-deployments-zero-downtime-model-updates/).\n\n## Research themes\n\n- **Speculative decoding and inference acceleration**: Live draft model training for speculative decoding represents original applied research into reducing inference latency [E9](https://www.baseten.co/blog/live-draft-model-training-for-speculative-decoding/). 
Day-zero Qwen 3 optimization with SGLang demonstrates capability to productionize new model architectures within hours of weight release [P20](https://www.baseten.co/blog/day-zero-benchmarks-for-qwen-3-with-sglang-on-baseten/).\n- **Timestep distillation (image generation)**: Applying Distribution Matching Distillation (DMD) to FLUX.2 to reduce sampling from 20 to 4–8 steps while preserving quality, with a distilled model released on HuggingFace — signals a research capability extending beyond text models into diffusion models [W1](https://www.baseten.co/blog/faster-image-generation-timestep-distillation-flux2/).\n- **Embedding inference optimization**: BEI built on TensorRT-LLM addresses the unique dual workload of high-throughput corpus processing and low-latency real-time querying, with the Performance Client adding a client-side optimization layer using Rust to bypass Python's GIL [P11](https://www.baseten.co/blog/how-we-built-bei-high-throughput-embedding-inference/)[P26](https://www.baseten.co/blog/your-client-code-matters-10x-higher-embedding-throughput-with-python-and-rust/).\n- **Post-training for domain-specific agents**: Collaboration with Harvey on post-training a 27B open-weight model for legal reasoning to reach \"the closed-source frontier band on LAB\" using in-the-loop training harnesses [W2](https://www.baseten.co/research/post-training-frontier-legal-agents-with-baseten-research/).\n- **Multi-node inference systems**: DeepSeek-R1 serving across 16 H100 GPUs in multi-node configuration required solving both infrastructure (interconnects, multi-cloud) and model performance (tensor parallelism, KV cache distribution) challenges [P9](https://www.baseten.co/blog/how-multi-node-inference-works-llms-deepseek-r1/).\n- **Hardware benchmarking and optimization**: Systematic testing across GH200, H100, H200, and B200 GPUs for inference workloads, with published comparisons including GH200's NVLink-C2C advantage for KV cache offloading 
[P5](https://www.baseten.co/blog/testing-llama-inference-performance-nvidia-gh200-lambda-cloud/)[P18](https://www.baseten.co/blog/accelerating-inference-nvidia-b200-gpus/).\n- **Compound AI systems**: Chains GA addresses model orchestration, inter-model latency, reliability, and cost-efficiency for multi-step AI workflows [P4](https://www.baseten.co/blog/baseten-chains-for-production-compound-ai-systems/).\n\n## Hiring & scaling\n\nEvidence of a company in a major scaling phase:\n\n- **$1.5B Series F** to fund tripling of headcount, with stated focus on engineering, research, operations, and GTM [W4](https://www.verdict.co.uk/baseten-secures-1-5bn-series-f/). Earlier $75M Series C (Feb 2025) funded the initial platform buildout [P8](https://www.baseten.co/blog/announcing-baseten-75m-series-c/).\n- **Management layer formation**: Simultaneous hiring of three Engineering Manager roles (Cloud Platform, Internal Platform, Runtime Fabric) plus a Technical Program Manager for Infrastructure signals a transition from founder-led IC teams to a structured engineering organization [E47](https://jobs.ashbyhq.com/baseten/0870ed34-7365-4b9f-a50a-481783b8c266)[E48](https://jobs.ashbyhq.com/baseten/1e721b74-58b9-4b03-ac0f-4b1e5c32342e)[E49](https://jobs.ashbyhq.com/baseten/aae72bd3-6f75-4238-9741-95fec11facb9)[E45](https://jobs.ashbyhq.com/baseten/7d9d5a1f-3834-434e-b22f-4bd62317be3c).\n- **Compute and capacity roles**: Dedicated Capacity Strategy & Operations Lead and Software Engineer — Capacity roles indicate that GPU supply chain and infrastructure scaling are now specialized functions requiring dedicated headcount [E33](https://jobs.ashbyhq.com/baseten/918dde84-09c6-4ee7-a0b9-a3e3253ac4b0)[E35](https://jobs.ashbyhq.com/baseten/902a7ddb-c21f-4272-aaab-879680697986).\n- **GTM team buildout**: Revenue Strategy & Operations, Partnerships Product Marketing, Customer Marketing, and Field Productivity & Enablement roles collectively point to a multi-channel enterprise GTM motion being stood up 
[E15](https://jobs.ashbyhq.com/baseten/6d32aa11-ac93-4f90-8f62-bdeb79214ee5)[E18](https://jobs.ashbyhq.com/baseten/132295d6-eeb4-4655-9847-a7e9a586d273)[E36](https://jobs.ashbyhq.com/baseten/6111c4a1-4e29-4fb8-aca3-1c8c8e0cbfb1)[E13](https://jobs.ashbyhq.com/baseten/5a1c6228-3906-4ccc-9988-d9bd67383b9d).\n- **Developer Experience investment**: A dedicated PM for Developer Experience alongside a Senior Frontend Engineer for Dedicated Inference suggests the platform's UI and API surfaces are receiving focused product attention [E42](https://jobs.ashbyhq.com/baseten/2d78fdcf-53e1-45d3-a047-2aefb5ad3153)[E16](https://jobs.ashbyhq.com/baseten/3622a1ee-50a9-4c45-af6e-aa12bd5de22f).\n- **San Francisco consolidation**: All cited roles are San Francisco-based, suggesting co-located scaling rather than distributed — notable given the multi-cloud infrastructure story [E13](https://jobs.ashbyhq.com/baseten/5a1c6228-3906-4ccc-9988-d9bd67383b9d)[E15](https://jobs.ashbyhq.com/baseten/6d32aa11-ac93-4f90-8f62-bdeb79214ee5)[E16](https://jobs.ashbyhq.com/baseten/3622a1ee-50a9-4c45-af6e-aa12bd5de22f)[E18](https://jobs.ashbyhq.com/baseten/132295d6-eeb4-4655-9847-a7e9a586d273)[E33](https://jobs.ashbyhq.com/baseten/918dde84-09c6-4ee7-a0b9-a3e3253ac4b0)[E35](https://jobs.ashbyhq.com/baseten/902a7ddb-c21f-4272-aaab-879680697986)[E36](https://jobs.ashbyhq.com/baseten/6111c4a1-4e29-4fb8-aca3-1c8c8e0cbfb1)[E42](https://jobs.ashbyhq.com/baseten/2d78fdcf-53e1-45d3-a047-2aefb5ad3153)[E45](https://jobs.ashbyhq.com/baseten/7d9d5a1f-3834-434e-b22f-4bd62317be3c)[E47](https://jobs.ashbyhq.com/baseten/0870ed34-7365-4b9f-a50a-481783b8c266)[E48](https://jobs.ashbyhq.com/baseten/1e721b74-58b9-4b03-ac0f-4b1e5c32342e)[E49](https://jobs.ashbyhq.com/baseten/aae72bd3-6f75-4238-9741-95fec11facb9)[E51](https://jobs.ashbyhq.com/baseten/71a011b6-0f17-4c0a-b0ba-38deffce1adb).\n- **Finance function scaling**: Strategic Finance hire at the Associate/Sr. 
Associate level indicates the G&A infrastructure needed to manage $1.5B in new capital [E51](https://jobs.ashbyhq.com/baseten/71a011b6-0f17-4c0a-b0ba-38deffce1adb).\n\n## Category implications\n\n- **Inference-to-training platform convergence**: With Model APIs and Training plus Loops SDK, Baseten is executing the same platform-expansion strategy seen at other neocloud providers: start with inference, add production training/post-training, and capture the full model lifecycle. This directly competes with dedicated training infrastructure providers while leveraging existing inference relationships [P25](https://www.baseten.co/blog/introducing-model-apis-and-training/)[W3](https://www.baseten.co/blog/mai-thinking-1/).\n- **Multi-cloud as competitive moat**: MCM across 10+ providers with self-hosted and hybrid deployment modes addresses enterprise compliance and vendor lock-in concerns. This architecture requires significant engineering investment (reflected in UCX/UCXX networking forks and capacity hiring) but creates a defensible position against single-cloud inference providers [P28](https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/)[E5](https://github.com/basetenlabs/ucx)[E6](https://github.com/basetenlabs/ucxx)[E33](https://jobs.ashbyhq.com/baseten/918dde84-09c6-4ee7-a0b9-a3e3253ac4b0).\n- **Research as product differentiator**: The emergence of \"Baseten Research\" as a named entity, with original work on timestep distillation, speculative decoding, and legal-agent post-training, mirrors the strategy of frontier labs using published research to signal technical depth to enterprise buyers. 
The FLUX.2 distilled model released on HuggingFace is a concrete artifact of this strategy [W1](https://www.baseten.co/blog/faster-image-generation-timestep-distillation-flux2/)[W2](https://www.baseten.co/research/post-training-frontier-legal-agents-with-baseten-research/)[E9](https://www.baseten.co/blog/live-draft-model-training-for-speculative-decoding/).\n- **DevEx as GTM wedge**: The high-velocity Truss release cadence (11 versions in ~2 weeks), multi-language SDK expansion (Go, Python), CLI tooling, and Developer Experience PM hire indicate that developer tooling quality is being treated as a primary GTM channel rather than a support function [E46](https://github.com/basetenlabs/truss/releases/tag/v0.18.7)[E52](https://github.com/basetenlabs/baseten-go/releases/tag/v0.1.0)[E53](https://github.com/basetenlabs/baseten-python/releases/tag/v0.9.0)[E12](https://github.com/basetenlabs/baseten-cli/releases/tag/v0.2.0)[E42](https://jobs.ashbyhq.com/baseten/2d78fdcf-53e1-45d3-a047-2aefb5ad3153).\n- **Embeddings as a volume play**: BEI plus the Performance Client targeting 12x throughput gains suggests Baseten sees embedding workloads as a high-volume, lower-margin entry point that can convert to higher-value LLM and training workloads — a classic land-and-expand infrastructure strategy [P7](https://www.baseten.co/blog/introducing-baseten-embeddings-inference-bei/)[P11](https://www.baseten.co/blog/how-we-built-bei-high-throughput-embedding-inference/)[P26](https://www.baseten.co/blog/your-client-code-matters-10x-higher-embedding-throughput-with-python-and-rust/).\n- **Open-source alignment**: Every product announcement (Model APIs, Chains, BEI, Training, Loops) prominently features open-source model support — Llama, DeepSeek, Qwen, Whisper, Orpheus TTS, GLM, Kimi K2, Mercury 2. 
The platform is positioning itself as the neutral, open-weights-first infrastructure layer in a market where closed-source API lock-in is the incumbent advantage [P25](https://www.baseten.co/blog/introducing-model-apis-and-training/)[P20](https://www.baseten.co/blog/day-zero-benchmarks-for-qwen-3-with-sglang-on-baseten/)[P22](https://www.baseten.co/blog/canopy-labs-selects-baseten-as-preferred-inference-provider-for-orpheus-tts-model/)[E29](https://www.baseten.co/resources/changelog/glm-52-available-on-baseten/)[E30](https://www.baseten.co/resources/changelog/kimi-k27-coder-on/)[E41](https://www.baseten.co/blog/mercury-2-is-now-available-on-baseten/)[P12](https://www.baseten.co/blog/a-checklist-for-switching-to-open-source-ml-models/).\n- **Enterprise compliance signaling**: SOC 2 Type II, HIPAA, GDPR, self-hosted VPC deployment, and the MCM architecture explicitly target regulated industries. The Harvey legal-agent partnership and BioNeMo agent toolkit support further signal vertical-specific enterprise GTM [P28](https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/)[W2](https://www.baseten.co/research/post-training-frontier-legal-agents-with-baseten-research/)[E20](https://www.baseten.co/blog/nvidia-bionemo-agent-toolkit-on-baseten/).\n\n## Traction highlights\n\n- **Capital raised**: $75M Series C (Feb 2025) followed by $1.5B Series F (Jun 2026), indicating a rapid increase in round size and investor conviction in the inference-platform thesis [P8](https://www.baseten.co/blog/announcing-baseten-75m-series-c/)[W4](https://www.verdict.co.uk/baseten-secures-1-5bn-series-f/)[E2](https://www.baseten.co/blog/announcing-our-series-f/).\n- **Named enterprise customers**: Abridge, OpenEvidence, Gamma, Writer, and Patreon are cited as production inference customers using the platform at scale [P24](https://www.baseten.co/blog/introducing-our-new-brand/). 
Canopy Labs selected Baseten as preferred inference provider for Orpheus TTS, which achieved 100K+ HuggingFace downloads as a top-5 trending model [P22](https://www.baseten.co/blog/canopy-labs-selects-baseten-as-preferred-inference-provider-for-orpheus-tts-model/).\n- **Launch partners**: Retool, OpenRouter, and Poe named as partners helping bring Model APIs to launch readiness [P25](https://www.baseten.co/blog/introducing-model-apis-and-training/). A Chroma integration provides official Baseten support within the vector database ecosystem [P15](https://www.baseten.co/blog/building-performant-embedding-workflows-with-chroma-and-baseten/).\n- **Model catalog breadth**: Platform supports GLM 5.2, Kimi K2.7 Coder, DeepSeek V4, GPT OSS 120B, Whisper Large V3, NVIDIA Nemotron 3 Ultra, Qwen 3, Llama 4, DeepSeek-R1/V3, Mercury 2, and MAI-Thinking-1 (forthcoming) [P6](https://www.baseten.co/resources/changelog/baseten-chains-is-now-ga-deploy-ultra-low-latency-compound-ai-at-scale/)[E29](https://www.baseten.co/resources/changelog/glm-52-available-on-baseten/)[E30](https://www.baseten.co/resources/changelog/kimi-k27-coder-on/)[E41](https://www.baseten.co/blog/mercury-2-is-now-available-on-baseten/)[W3](https://www.baseten.co/blog/mai-thinking-1/)[P20](https://www.baseten.co/blog/day-zero-benchmarks-for-qwen-3-with-sglang-on-baseten/).\n- **Performance claims**: 5x throughput and 38% lower latency on B200 vs Hopper, 2x embedding throughput with BEI, 12x client-side throughput with Performance Client, 16–24 simultaneous TTS streams on half an H100, and day-zero optimization of new model releases (Qwen 3, GLM 5.2) 
[P18](https://www.baseten.co/blog/accelerating-inference-nvidia-b200-gpus/)[P7](https://www.baseten.co/blog/introducing-baseten-embeddings-inference-bei/)[P26](https://www.baseten.co/blog/your-client-code-matters-10x-higher-embedding-throughput-with-python-and-rust/)[P22](https://www.baseten.co/blog/canopy-labs-selects-baseten-as-preferred-inference-provider-for-orpheus-tts-model/)[P20](https://www.baseten.co/blog/day-zero-benchmarks-for-qwen-3-with-sglang-on-baseten/)[E1](https://www.baseten.co/blog/how-we-built-the-worlds-fastest-api-for-glm-52/).\n- **Infrastructure scale**: Thousands of GPUs across 10+ cloud providers, multiple regions globally, with 99.99% uptime [P28](https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/).\n\n## Sources\n\n- [P1](https://github.com/basetenlabs/run-report-action/releases/tag/v1) basetenlabs/run-report-action v1 release\n- [P2](https://github.com/basetenlabs/action-junit-report) basetenlabs/action-junit-report repo metadata (fork of mikepenz/action-junit-report)\n- [P3](https://github.com/basetenlabs/run-report-action) basetenlabs/run-report-action repo metadata (fork of moonrepo/run-report-action)\n- [P4](https://www.baseten.co/blog/baseten-chains-for-production-compound-ai-systems/) \"Baseten Chains for Production Compound AI Systems\" blog post\n- [P5](https://www.baseten.co/blog/testing-llama-inference-performance-nvidia-gh200-lambda-cloud/) \"Testing Llama Inference Performance Nvidia GH200 Lambda Cloud\" blog post\n- [P6](https://www.baseten.co/resources/changelog/baseten-chains-is-now-ga-deploy-ultra-low-latency-compound-ai-at-scale/) \"Baseten Chains Is Now GA\" changelog\n- [P7](https://www.baseten.co/blog/introducing-baseten-embeddings-inference-bei/) \"Introducing Baseten Embeddings Inference (BEI)\" blog post\n- [P8](https://www.baseten.co/blog/announcing-baseten-75m-series-c/) \"Announcing Baseten's $75M Series C\" blog post\n- 
[P9](https://www.baseten.co/blog/how-multi-node-inference-works-llms-deepseek-r1/) \"How Multi Node Inference Works for LLMs like DeepSeek-R1\" blog post\n- [P10](https://www.baseten.co/resources/changelog/baseten-is-fully-openai-compatible/) \"Baseten is now fully OpenAI compatible\" changelog\n- [P11](https://www.baseten.co/blog/how-we-built-bei-high-throughput-embedding-inference/) \"How We Built BEI: High-Throughput Embedding Inference\" blog post\n- [P12](https://www.baseten.co/blog/a-checklist-for-switching-to-open-source-ml-models/) \"A Checklist For Switching To Open Source ML Models\" blog post\n- [P13](https://www.baseten.co/blog/deployment-and-inference-for-open-source-text-embedding-models/) \"Deployment and Inference for Open Source Text Embedding Models\" blog post\n- [P14](https://www.baseten.co/resources/changelog/docs-refresh/) \"Docs Refresh\" changelog\n- [P15](https://www.baseten.co/blog/building-performant-embedding-workflows-with-chroma-and-baseten/) \"Building Performant Embedding Workflows with Chroma and Baseten\" blog post\n- [P16](https://www.baseten.co/resources/changelog/stream-baseten-logs-from-terminal/) \"Stream Baseten Logs From Terminal\" changelog\n- [P17](https://www.baseten.co/resources/changelog/flexible-instance-types-per-model-deployment/) \"Flexible Instance Types Per Model Deployment\" changelog\n- [P18](https://www.baseten.co/blog/accelerating-inference-nvidia-b200-gpus/) \"Accelerating Inference with NVIDIA B200 GPUs\" blog post\n- [P19](https://www.baseten.co/resources/changelog/early-access-announcing-b200s-on-baseten/) \"Early Access: Announcing B200s on Baseten\" changelog\n- [P20](https://www.baseten.co/blog/day-zero-benchmarks-for-qwen-3-with-sglang-on-baseten/) \"Day Zero Benchmarks for Qwen 3 with SGLang on Baseten\" blog post\n- [P21](https://www.baseten.co/resources/changelog/introducing-our-new-brand/) \"Introducing Our New Brand\" changelog\n- 
[P22](https://www.baseten.co/blog/canopy-labs-selects-baseten-as-preferred-inference-provider-for-orpheus-tts-model/) \"Canopy Labs Selects Baseten as Preferred Inference Provider for Orpheus TTS\" blog post\n- [P23](https://www.baseten.co/blog/ai-inference-explained/) \"AI Inference Explained\" blog post\n- [P24](https://www.baseten.co/blog/introducing-our-new-brand/) \"Introducing Our New Brand\" blog post\n- [P25](https://www.baseten.co/blog/introducing-model-apis-and-training/) \"Introducing Model APIs and Training\" blog post\n- [P26](https://www.baseten.co/blog/your-client-code-matters-10x-higher-embedding-throughput-with-python-and-rust/) \"Your Client Code Matters: 12x Higher Embedding Throughput with Python and Rust\" blog post\n- [P27](https://www.baseten.co/blog/forward-deployed-engineering/) \"Forward Deployed Engineering\" blog post\n- [P28](https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/) \"How Baseten Multi-Cloud Capacity Management (MCM) Unifies Deployments\" blog post\n- [E1](https://www.baseten.co/blog/how-we-built-the-worlds-fastest-api-for-glm-52/) \"How We Built The World's Fastest API for GLM 5.2\" event\n- [E2](https://www.baseten.co/blog/announcing-our-series-f/) \"Announcing Our Series F\" event\n- [E3](https://github.com/basetenlabs/rlm) basetenlabs/rlm fork event (parent: alexzhang13/rlm)\n- [E4](https://github.com/basetenlabs/langchain-baseten/releases/tag/libs/baseten/v0.2.1) basetenlabs/langchain-baseten v0.2.1 release event\n- [E5](https://github.com/basetenlabs/ucx) basetenlabs/ucx fork event (parent: openucx/ucx)\n- [E6](https://github.com/basetenlabs/ucxx) basetenlabs/ucxx fork event (parent: rapidsai/ucxx)\n- [E7](https://www.baseten.co/blog/ai-training-vs-inference/) \"AI Training vs Inference\" event\n- [E8](https://www.baseten.co/resources/changelog/async-log-downloads/) \"Async Log Downloads\" event\n- 
[E9](https://www.baseten.co/blog/live-draft-model-training-for-speculative-decoding/) \"Live Draft Model Training for Speculative Decoding\" event\n- [E10](https://www.baseten.co/resources/changelog/model-deprecation-deepseek-v31-minimax-m25/) \"Model Deprecation DeepSeek V3.1 MiniMax M2.5\" event\n- [E11](https://github.com/basetenlabs/truss/releases/tag/v0.18.17) basetenlabs/truss v0.18.17 release event\n- [E12](https://github.com/basetenlabs/baseten-cli/releases/tag/v0.2.0) basetenlabs/baseten-cli v0.2.0 release event\n- [E13](https://jobs.ashbyhq.com/baseten/5a1c6228-3906-4ccc-9988-d9bd67383b9d) Field Productivity & Enablement Lead job event\n- [E14](https://www.baseten.co/blog/how-to-run-glm-52-in-any-harness/) \"How To Run GLM 5.2 In Any Harness\" event\n- [E15](https://jobs.ashbyhq.com/baseten/6d32aa11-ac93-4f90-8f62-bdeb79214ee5) Senior Analyst, Revenue Strategy & Operations job event\n- [E16](https://jobs.ashbyhq.com/baseten/3622a1ee-50a9-4c45-af6e-aa12bd5de22f) Senior Frontend Engineer job event\n- [E17](https://github.com/basetenlabs/truss/releases/tag/v0.18.16) basetenlabs/truss v0.18.16 release event\n- [E18](https://jobs.ashbyhq.com/baseten/132295d6-eeb4-4655-9847-a7e9a586d273) Partnerships Product Marketing Manager job event\n- [E19](https://github.com/basetenlabs/truss/releases/tag/v0.18.16rc0) basetenlabs/truss v0.18.16rc0 release event\n- [E20](https://www.baseten.co/blog/nvidia-bionemo-agent-toolkit-on-baseten/) \"Nvidia Bionemo Agent Toolkit On Baseten\" event\n- [E21](https://github.com/basetenlabs/truss/releases/tag/v0.18.15) basetenlabs/truss v0.18.15 release event\n- [E22](https://github.com/basetenlabs/container-debug-support) basetenlabs/container-debug-support fork event\n- [E23](https://github.com/basetenlabs/sw-example-ci-cd) basetenlabs/sw-example-ci-cd repo event\n- [E24](https://www.baseten.co/resources/changelog/filter-and-stream-model-logs-from-the-cli/) \"Filter And Stream Model Logs From The CLI\" event\n- 
[E25](https://github.com/basetenlabs/truss/releases/tag/v0.18.14) basetenlabs/truss v0.18.14 release event\n- [E26](https://github.com/basetenlabs/DeepGEMM) basetenlabs/DeepGEMM fork event (parent: deepseek-ai/DeepGEMM)\n- [E27](https://www.baseten.co/blog/the-best-open-source-large-language-models-llms/) \"The Best Open Source Large Language Models\" event\n- [E28](https://github.com/basetenlabs/truss/releases/tag/v0.18.13) basetenlabs/truss v0.18.13 release event\n- [E29](https://www.baseten.co/resources/changelog/glm-52-available-on-baseten/) \"GLM 5.2 Available On Baseten\" event\n- [E30](https://www.baseten.co/resources/changelog/kimi-k27-coder-on/) \"Kimi K2.7 Coder On\" event\n- [E31](https://github.com/basetenlabs/truss/releases/tag/v0.18.12) basetenlabs/truss v0.18.12 release event\n- [E32](https://github.com/basetenlabs/truss/releases/tag/v0.18.11) basetenlabs/truss v0.18.11 release event\n- [E33](https://jobs.ashbyhq.com/baseten/918dde84-09c6-4ee7-a0b9-a3e3253ac4b0) Capacity Strategy & Operations Lead job event\n- [E34](https://github.com/basetenlabs/truss/releases/tag/v0.18.10) basetenlabs/truss v0.18.10 release event\n- [E35](https://jobs.ashbyhq.com/baseten/902a7ddb-c21f-4272-aaab-879680697986) Software Engineer - Capacity job event\n- [E36](https://jobs.ashbyhq.com/baseten/6111c4a1-4e29-4fb8-aca3-1c8c8e0cbfb1) Customer Marketing Manager job event\n- [E37](https://www.baseten.co/blog/rolling-deployments-zero-downtime-model-updates/) \"Rolling Deployments Zero Downtime Model Updates\" event\n- [E38](https://www.baseten.co/resources/changelog/new-sidebar-navigation/) \"New Sidebar Navigation\" event\n- [E39](https://github.com/basetenlabs/truss/releases/tag/v0.18.9) basetenlabs/truss v0.18.9 release event\n- [E40](https://www.baseten.co/resources/changelog/container-restart-tracking/) \"Container Restart Tracking\" event\n- [E41](https://www.baseten.co/blog/mercury-2-is-now-available-on-baseten/) \"Mercury 2 Is Now Available On Baseten\" event\n- 
[E42](https://jobs.ashbyhq.com/baseten/2d78fdcf-53e1-45d3-a047-2aefb5ad3153) Product Manager, Developer Experience job event\n- [E43](https://github.com/basetenlabs/truss/releases/tag/v0.18.8) basetenlabs/truss v0.18.8 release event\n- [E44](https://www.baseten.co/resources/changelog/vllm-and-sglang-metrics/) \"vLLM And SGLang Metrics\" event\n- [E45](https://jobs.ashbyhq.com/baseten/7d9d5a1f-3834-434e-b22f-4bd62317be3c) Technical Program Manager, Infrastructure job event\n- [E46](https://github.com/basetenlabs/truss/releases/tag/v0.18.7) basetenlabs/truss v0.18.7 release event\n- [E47](https://jobs.ashbyhq.com/baseten/0870ed34-7365-4b9f-a50a-481783b8c266) Engineering Manager, Cloud Platform job event\n- [E48](https://jobs.ashbyhq.com/baseten/1e721b74-58b9-4b03-ac0f-4b1e5c32342e) Engineering Manager, Internal Platform job event\n- [E49](https://jobs.ashbyhq.com/baseten/aae72bd3-6f75-4238-9741-95fec11facb9) Engineering Manager, Runtime Fabric job event\n- [E50](https://www.baseten.co/resources/changelog/log-export-to-otlp-endpoints/) \"Log Export To OTLP Endpoints\" event\n- [E51](https://jobs.ashbyhq.com/baseten/71a011b6-0f17-4c0a-b0ba-38deffce1adb) Strategic Finance Associate / Sr. 
Associate job event\n- [E52](https://github.com/basetenlabs/baseten-go/releases/tag/v0.1.0) basetenlabs/baseten-go v0.1.0 release event\n- [E53](https://github.com/basetenlabs/baseten-python/releases/tag/v0.9.0) basetenlabs/baseten-python v0.9.0 release event\n- [E54](https://github.com/basetenlabs/ideogram4) basetenlabs/ideogram4 fork event (parent: ideogram-oss/ideogram4)\n- [E55](https://github.com/basetenlabs/tinker-cookbook) basetenlabs/tinker-cookbook fork event (parent: thinking-machines-lab/tinker-cookbook)\n- [E56](https://github.com/basetenlabs/runc) basetenlabs/runc fork event (parent: opencontainers/runc)\n- [E57](https://github.com/basetenlabs/compact-rl) basetenlabs/compact-rl fork event (parent: PrimeIntellect-ai/prime-rl)\n- [E58](https://github.com/basetenlabs/TorchSpec) basetenlabs/TorchSpec fork event (parent: lightseekorg/TorchSpec)\n- [E59](https://github.com/basetenlabs/mcore-bridge) basetenlabs/mcore-bridge fork event (parent: modelscope/mcore-bridge)\n- [E60](https://github.com/basetenlabs/autocomp) basetenlabs/autocomp fork event (parent: ucb-bar/autocomp)\n- [W1](https://www.baseten.co/blog/faster-image-generation-timestep-distillation-flux2/) \"Timestep distillation: 2.5x faster FLUX.2 image generation\" web\n- [W2](https://www.baseten.co/research/post-training-frontier-legal-agents-with-baseten-research/) \"Post-training frontier legal agents with Baseten Research\" web\n- [W3](https://www.baseten.co/blog/mai-thinking-1/) \"MAI-Thinking-1 is coming to Baseten\" web\n- [W4](https://www.verdict.co.uk/baseten-secures-1-5bn-series-f/) \"Baseten secures $1.5bn in Series F\" Verdict 
web","generated_at":"2026-06-27T18:45:40.723+00:00","citations":[{"url":"https://www.baseten.co/blog/introducing-model-apis-and-training/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.verdict.co.uk/baseten-secures-1-5bn-series-f/","path":null,"label":"verdict.co.uk/baseten-secures-1-5bn-series-f","type":"external"},{"url":"https://www.baseten.co/blog/faster-image-generation-timestep-distillation-flux2/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/research/post-training-frontier-legal-agents-with-baseten-research/","path":null,"label":"baseten.co/research","type":"external"},{"url":"https://www.baseten.co/blog/how-baseten-multi-cloud-capacity-management-mcm-powers-cloud-self-hosted-and-hybr/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://github.com/basetenlabs/ucx","path":null,"label":"basetenlabs/ucx","type":"external"},{"url":"https://github.com/basetenlabs/ucxx","path":null,"label":"basetenlabs/ucxx","type":"external"},{"url":"https://github.com/basetenlabs/DeepGEMM","path":null,"label":"basetenlabs/DeepGEMM","type":"external"},{"url":"https://github.com/basetenlabs/compact-rl","path":null,"label":"basetenlabs/compact-rl","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/0870ed34-7365-4b9f-a50a-481783b8c266","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/1e721b74-58b9-4b03-ac0f-4b1e5c32342e","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/aae72bd3-6f75-4238-9741-95fec11facb9","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/7d9d5a1f-3834-434e-b22f-4bd62317be3c","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/918dde84-09c6-4ee7-a0b9-a3e3253ac4b0","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"htt
ps://jobs.ashbyhq.com/baseten/902a7ddb-c21f-4272-aaab-879680697986","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/6d32aa11-ac93-4f90-8f62-bdeb79214ee5","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/132295d6-eeb4-4655-9847-a7e9a586d273","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/6111c4a1-4e29-4fb8-aca3-1c8c8e0cbfb1","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/5a1c6228-3906-4ccc-9988-d9bd67383b9d","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/2d78fdcf-53e1-45d3-a047-2aefb5ad3153","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/3622a1ee-50a9-4c45-af6e-aa12bd5de22f","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://jobs.ashbyhq.com/baseten/71a011b6-0f17-4c0a-b0ba-38deffce1adb","path":null,"label":"jobs.ashbyhq.com/baseten","type":"external"},{"url":"https://github.com/basetenlabs/TorchSpec","path":null,"label":"basetenlabs/TorchSpec","type":"external"},{"url":"https://github.com/basetenlabs/autocomp","path":null,"label":"basetenlabs/autocomp","type":"external"},{"url":"https://github.com/basetenlabs/ideogram4","path":null,"label":"basetenlabs/ideogram4","type":"external"},{"url":"https://github.com/basetenlabs/action-junit-report","path":null,"label":"basetenlabs/action-junit-report","type":"external"},{"url":"https://github.com/basetenlabs/run-report-action","path":null,"label":"basetenlabs/run-report-action","type":"external"},{"url":"https://github.com/basetenlabs/run-report-action/releases/tag/v1","path":null,"label":"basetenlabs/run-report-action","type":"external"},{"url":"https://github.com/basetenlabs/container-debug-support","path":null,"label":"basetenlabs/container-
debug-support","type":"external"},{"url":"https://github.com/basetenlabs/rlm","path":null,"label":"basetenlabs/rlm","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.7","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.8","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.9","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.10","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.11","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.12","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.13","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.14","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.15","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.16","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.17","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/truss/releases/tag/v0.18.16rc0","path":null,"label":"basetenlabs/truss","type":"external"},{"url":"https://github.com/basetenlabs/baseten-go/releases/tag/v0.1.0","path":null,"label":"basetenlabs/baseten-go","type":"external"},{"url":"https://github.com/basetenlabs/baseten-python/releases/tag/v0.9.0","path":null,"label":"basetenlabs/baseten-python","type":"external"},{"url":"https://github.com/basetenlabs/baseten-cli/releases/tag
/v0.2.0","path":null,"label":"basetenlabs/baseten-cli","type":"external"},{"url":"https://github.com/basetenlabs/langchain-baseten/releases/tag/libs/baseten/v0.2.1","path":null,"label":"basetenlabs/langchain-baseten","type":"external"},{"url":"https://www.baseten.co/blog/announcing-our-series-f/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/announcing-baseten-75m-series-c/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/testing-llama-inference-performance-nvidia-gh200-lambda-cloud/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/accelerating-inference-nvidia-b200-gpus/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/day-zero-benchmarks-for-qwen-3-with-sglang-on-baseten/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/how-we-built-the-worlds-fastest-api-for-glm-52/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/introducing-baseten-embeddings-inference-bei/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/how-we-built-bei-high-throughput-embedding-inference/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/your-client-code-matters-10x-higher-embedding-throughput-with-python-and-rust/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/mai-thinking-1/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/live-draft-model-training-for-speculative-decoding/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/baseten-chains-for-production-compound-ai-systems/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/resources/changelog/baseten-chains-is-now-
ga-deploy-ultra-low-latency-compound-ai-at-scale/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/blog/forward-deployed-engineering/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/a-checklist-for-switching-to-open-source-ml-models/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/deployment-and-inference-for-open-source-text-embedding-models/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/how-multi-node-inference-works-llms-deepseek-r1/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/the-best-open-source-large-language-models-llms/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/canopy-labs-selects-baseten-as-preferred-inference-provider-for-orpheus-tts-model/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/building-performant-embedding-workflows-with-chroma-and-baseten/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/nvidia-bionemo-agent-toolkit-on-baseten/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/resources/changelog/stream-baseten-logs-from-terminal/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/flexible-instance-types-per-model-deployment/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/baseten-is-fully-openai-compatible/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/docs-refresh/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/async-log-downloads/","path":null,"label":"baseten.co/resources"
,"type":"external"},{"url":"https://www.baseten.co/resources/changelog/log-export-to-otlp-endpoints/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/blog/rolling-deployments-zero-downtime-model-updates/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/resources/changelog/container-restart-tracking/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/vllm-and-sglang-metrics/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/filter-and-stream-model-logs-from-the-cli/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/glm-52-available-on-baseten/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/kimi-k27-coder-on/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/blog/mercury-2-is-now-available-on-baseten/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/resources/changelog/model-deprecation-deepseek-v31-minimax-m25/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/resources/changelog/introducing-our-new-brand/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/blog/introducing-our-new-brand/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://github.com/basetenlabs/tinker-cookbook","path":null,"label":"basetenlabs/tinker-cookbook","type":"external"},{"url":"https://github.com/basetenlabs/mcore-bridge","path":null,"label":"basetenlabs/mcore-bridge","type":"external"},{"url":"https://github.com/basetenlabs/runc","path":null,"label":"basetenlabs/runc","type":"external"},{"url":"https://www.baseten.co/resources/changelog/early-access-announcing-b200
s-on-baseten/","path":null,"label":"baseten.co/resources","type":"external"},{"url":"https://www.baseten.co/blog/ai-inference-explained/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/ai-training-vs-inference/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://www.baseten.co/blog/how-to-run-glm-52-in-any-harness/","path":null,"label":"baseten.co/blog","type":"external"},{"url":"https://github.com/basetenlabs/sw-example-ci-cd","path":null,"label":"basetenlabs/sw-example-ci-cd","type":"external"},{"url":"https://www.baseten.co/resources/changelog/new-sidebar-navigation/","path":null,"label":"baseten.co/resources","type":"external"}],"provenance":{"provider":"deepseek","model":"deepseek-v4-pro","workflow":"onlylabs-deepagents-analysis-v3","agent":"deepagents"},"evidence":{"total":92,"pages":28,"events":140,"web":4,"signal_desks":{"forks":12,"repos":1,"hiring":13,"talking":18,"releases":16},"data_radar_lanes":null,"data_radar_matches":null}},"signal_counts":{"total":434,"model_released":0,"release":66,"repo_new":43,"repo_forked":67,"post_published":180,"job_opened":78}}