{"schema_version":"onlylabs.public_analysis_evidence.v1","title":"Together AI analysis evidence pack","description":"Public onlylabs evidence pack for cited agent analysis: captured pages, ranked public signals, and stored web-search provenance used by the background analysis workflow.","url":"https://onlylabs.fyi/analysis/together-ai","json_url":"https://onlylabs.fyi/analysis/together-ai/evidence.json","generated_at":"2026-06-27T22:36:19.278Z","org":{"slug":"together-ai","name":"Together AI","category":"neocloud","category_label":"Neocloud","dossier_url":"https://onlylabs.fyi/labs/together-ai"},"analysis":{"url":"https://onlylabs.fyi/analysis/together-ai","json_url":"https://onlylabs.fyi/analysis/together-ai/analysis.json","generated_at":"2026-06-27T18:44:11.988+00:00"},"workflow":{"version":"onlylabs-deepagents-analysis-v3","provider":"deepseek","model":"deepseek-v4-pro","agent":"deepagents","public_pack_mode":"local-pages-and-events","live_web_fetches":false,"note":"Public evidence exports do not trigger live Exa calls; stored Exa provenance is included when analysis metadata contains it."},"stats":{"pages":28,"events":140,"web":0,"evidence":88,"signal_desks":{"hiring":31,"forks":4,"releases":12,"talking":12,"repos":1},"data_radar_lanes":null,"data_radar_matches":null,"stored_analysis_evidence":92,"stored_analysis_web":4,"stored_analysis_signal_desks":{"forks":4,"repos":1,"hiring":31,"talking":12,"releases":12},"stored_analysis_data_radar_lanes":null,"stored_analysis_data_radar_matches":null},"stored_web_provenance":{"queries":["\"Together AI\" frontier AI lab recent model release research hiring GitHub Hugging Face","\"Together AI\" AI lab what they are building talking about hiring releasing forking"],"request_ids":["749bf7a06d200d03ecd717ea94efa012","ce5481db534498a12368437c7689940d"],"skipped":null},"evidence":[{"ref":"P1","kind":"page","title":"Head of Hyperscaler 
Partnerships","date":"2026-06-27T07:11:53.638405+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5171124007","signal_url":null,"signal_json_url":null,"text":"Job Application for Head of Hyperscaler Partnerships at Together AI \nBack to jobs New \nHead of Hyperscaler Partnerships\nSan Francisco\n\nApply \nAbout the Role \n\nTogether AI is building the AI-native cloud — the fastest inference infrastructure on the planet, paired with a model ecosystem, orchestration layer, and data center footprint that enterprises and frontier labs depend on. As we deepen our relationships with the world's largest cloud platforms and technology ecosystems, we're hiring a Head of Hyperscaler Partnerships to lead these deals in our partner portfolio.\n\nThis is a principal-level role for a seasoned deal-maker who has navigated some of the most complex partnership structures in enterprise technology — across model licensing, software integrations, inference and model serving, and scaled cloud distribution. You will sit at the intersection of commercial strategy, product, and finance, owning end-to-end partnership cycles with hyperscalers, neoclouds, and platform partners that shape how Together AI's technology reaches the market.\n\nYou will bring deep experience inside a hyperscaler or have structured major deals with one. You understand how these organizations work from the inside — how decisions get made, which stakeholders matter, and how to unlock joint commercialization at scale. You will operate with significant autonomy, reporting into the VP of Strategic Partnerships and working closely with our CEO, CFO, CRO, and legal teams on deals that require board-level judgment.\n\nResponsibilities \n\nOwn the Full Deal Cycle for Hyperscaler Partnerships: Lead end-to-end partnership development with major cloud service providers. 
You will manage relationship-building through complex commercial negotiations, launch, and long-term expansion.\n\nNavigate Complex Orgs with Precision: Map and develop relationships across the full stakeholder matrix at partner organizations — from product and engineering to alliance managers, procurement, legal, and C-suite executives.\n\nDrive Commercial Structures That Create Durable Value: Design and negotiate deal structures across multiple surfaces, including revshare, marketplace private offers, and model licensing. You will "},{"ref":"P2","kind":"page","title":"Software Engineer(Amsterdam)","date":"2026-06-26T07:02:23.959977+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5169470007","signal_url":null,"signal_json_url":null,"text":"Job Application for Software Engineer(Amsterdam) at Together AI \nBack to jobs tags.new \nSoftware Engineer(Amsterdam)\nAmsterdam\n\nApply \nAbout the Role \n\nTogether.ai is looking for a Software Engineer to join the Identity & Collaboration team — a great role for a full-stack or backend engineer who wants to grow into functional programming and the identity space. As part of the Product Foundations engineering group, the Identity & Collaboration team owns authentication flows (including SSO and OAuth), organizations, projects, API keys, and the role-based access controls that enable secure collaboration at scale.\n\nEvery customer interaction with Together relies on the systems we build. Whether it's a researcher accessing their models, an enterprise team collaborating on a shared project, or a developer making an API call, we make authentication seamless and invisible for simple cases while providing robust, enterprise-grade capabilities for complex organizational structures. 
Our work directly enables Together's growth from individual users to large enterprise teams, and we're actively building the next generation of collaboration features that will unlock new ways for customers to work together securely and efficiently across all Together products.\n\nLocation: Hybrid in Amsterdam, NL or remote UK, Ireland & Germany\n\nFull-time: This means 40 flexible hours, Monday through Friday.\n\nYou'll own well-defined features and small projects end-to-end, shipping work that lands in front of customers. You'll get guidance as you take on unfamiliar problem spaces, with plenty of room to grow toward more autonomy over time. We pair, review each other's code, and learn in the open — it's a strong environment to level up in.\n\nResponsibilities \n\nBuild and ship features across the stack — TypeScript/Next.js on the frontend and Elixir/Phoenix services on the backend (which you can grow into)\n\nOwn well-defined pieces of work end-to-end, from implementation through testing and rollout\n\nContribute to our identity and access features: SSO, OAuth, organizations, projects, API keys, and role-based access control\n\nLearn the Elixir/Erlang VM (BEAM) and how we run it in production\n\nParticipate m"},{"ref":"P3","kind":"page","title":"Product Manager, AI Infrastructure","date":"2026-06-25T07:02:36.323451+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5172169007","signal_url":null,"signal_json_url":null,"text":"Job Application for Product Manager, AI Infrastructure at Together AI \nBack to jobs New \nProduct Manager, AI Infrastructure\nSan Francisco\n\nApply \nAbout the Role\n\nOur product surface is expanding fast - GPU clusters, managed storage, networking, and observability - and we're adding a Product Manager to the Together Cloud team to own the day-to-day product work that keeps these AI infrastructure products moving. 
You'll start across the full surface, with an early focus on observability and GPU Clusters, partnering closely with engineering to ship the feature work that customers feel every day.\n\nThis is a role for someone who runs toward problems. You won't be handed a backlog and asked to coordinate it - you'll find the issues others haven't spotted yet, drive them to resolution, and use data and experimentation to decide what to build. You'll have real autonomy from day one, and the breadth of working across compute, storage, and observability that few PM roles offer.\n\nIt's also a role with a clear path. Within roughly nine months, the goal is for you to grow into full ownership of a complete product area — observability or storage as your own. You'll report directly to a Staff Product Manager, and our product leadership (including our CPO) is closely involved with this team. If you want to build infrastructure that the AI ecosystem runs on, and earn ownership quickly by proving you can operate, this is that seat.\n\nResponsibilities\n\nOwn the day-to-day product and feature work across Together's AI infrastructure products - GPU Clusters, Managed Storage, and observability - with an early focus on observability and storage.\n\nFind and drive problems to resolution with minimal guidance, including issues that aren't yet on anyone's radar.\n\nRun structured, hypothesis-driven experiments - reading the data yourself and driving the data collection and instrumentation needed to answer open questions.\n\nPartner across engineering, product, and partner teams to ship improvements and unblock work without waiting for permission.\n\nTranslate a deep understanding of customers operating in fast-moving, high-ambiguity markets into product decisions.\n\nJuggle multiple priorities and wo"},{"ref":"P4","kind":"page","title":"togethercomputer/together-py 
v2.18.0","date":"2026-06-25T07:02:35.906706+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.18.0","signal_url":null,"signal_json_url":null,"text":"# v2.18.0\n\nRepository: togethercomputer/together-py\n\nTag: v2.18.0\n\nPublished: 2026-06-24T16:02:39Z\n\nPrerelease: no\n\nRelease notes:\n## [2.18.0](https://github.com/togethercomputer/together-py/compare/v2.17.0...v2.18.0) (2026-06-23)\n\n### Features\n\n* Add \"whoami\" cli command ([#414](https://github.com/togethercomputer/together-py/issues/414)) ([ce47771](https://github.com/togethercomputer/together-py/commit/ce477710df337345fdf0190ae607ed4c69b3ed25))\n* add /v1/whoami endpoint to OpenAPI spec ([575deea](https://github.com/togethercomputer/together-py/commit/575deea5b95a2070813c2e4dfca409bd3725918c))"},{"ref":"P5","kind":"page","title":"togethercomputer/together-py v2.19.0","date":"2026-06-25T07:02:35.619033+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.19.0","signal_url":null,"signal_json_url":null,"text":"# v2.19.0\n\nRepository: togethercomputer/together-py\n\nTag: v2.19.0\n\nPublished: 2026-06-24T21:26:38Z\n\nPrerelease: no\n\nRelease notes:\n## [2.19.0](https://github.com/togethercomputer/together-py/compare/v2.18.0...v2.19.0) (2026-06-24)\n\n### Features\n\n* **cli:** expose remediation approval mode ([#418](https://github.com/togethercomputer/together-py/issues/418)) ([d52a5c4](https://github.com/togethercomputer/together-py/commit/d52a5c481bc2eb6c6b467f0837f75d819cb927a5))\n\n### Chores\n\n* Clarify warning messages on unavailable price estimation for fine tuning ([#417](https://github.com/togethercomputer/together-py/issues/417)) ([2c0ee3c](https://github.com/togethercomputer/together-py/commit/2c0ee3c3b12472d1b617d477e9490ac262cde1cd))"},{"ref":"P6","kind":"page","title":"Platform Engineer, Model 
Shaping","date":"2026-06-24T07:04:46.145379+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/4790243007","signal_url":null,"signal_json_url":null,"text":"Job Application for Platform Engineer, Model Shaping at Together AI \nBack to jobs \nPlatform Engineer, Model Shaping\nSan Francisco \n\nApply \nAbout the Role \n\nThe Model Shaping team at Together AI works on products and research for tailoring open foundation models to downstream applications. We build services that allow machine learning developers to choose the best models for their tasks and further improve these models using domain-specific data. In addition to that, we develop new methods for more efficient model training and evaluation, drawing inspiration from a broad spectrum of ideas across machine learning, natural language processing, and ML systems.\n\nAs a Platform Engineer in Model Shaping, you will work at the intersection of backend engineering and infrastructure, building the foundational layers of Together’s platform for model customization and evaluation. You will design, develop, and operate both the backend services and the underlying systems that enable us to sustainably and reliably scale production workflows launched by our users, as well as internal research experiments.\n\nYou will operate in a cross-functional environment, collaborating with other engineers and researchers in the team to improve the infrastructure based on the needs of projects they work on. 
You will also interact with other engineering teams at Together (such as Commerce, Data Engineering, and Cloud Infrastructure) to integrate the services developed by Model Shaping with systems developed by those teams.\n\nResponsibilities \n\nDesign and build Together’s systems and infrastructure for model customization, including user-facing features and internal improvements\n\nContribute to reliability improvements for the platform, participating in an on-call rotation and improving processes for incident response\n\nCreate and improve internal tooling for deployment, continuous integration, and observability\n\nBuild a job orchestration platform spanning multiple datacenters, supporting a highly heterogeneous hardware landscape\n\nPartner with teams developing internal services, co-designing these services and incorporating them in systems built within Together\n\nRequirements \n\n3+ years of experienc"},{"ref":"P7","kind":"page","title":"ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet)","date":"2026-06-23T20:04:16.95523+00:00","date_source":null,"source_url":"https://www.together.ai/blog/parallelkernelbench","signal_url":null,"signal_json_url":null,"text":"ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet) \n\n🚀 Now serving MiniMax-M3 for efficient inference →\n\n⚡ On-demand B200s now available on Together GPU Clusters →\n\n📊 Delivering 31% more TPS than the next-fastest OSS engine for production coding agent workloads →\n\n💬 How Together built the world's fastest speech-to-text stack →\n\n🇫🇷 Join us at RAISE 2026 in Paris →\n\nAll blog posts\n\nResearch\n\nPublished 6/23/2026 \n\nParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet)\n\nThe best frontier model solves under a third of 87 real-world problems — but a few generated kernels beat anything publicly available.\n\nAuthors\n\nWilly Chan, Nathan Paek, Simon Guo, Simran Arora, Daniel Y. 
Fu\n\nTable of contents\n\nLinks in this article\n\nPaper \nHuggingFace \nCode \n\nSummary\n\nLLMs have gotten surprisingly good at writing GPU kernels [1][2][3], but almost all current benchmarks measuring that progress are single-GPU. In production, communication is often the bottleneck: communication overhead can account for over 20% of inference latency [4], and that gap keeps widening as compute scales faster than interconnect bandwidth.\nParallelKernelBench (PKB) offers a benchmark and evaluation framework for multi-GPU kernel generation and includes 87 problems from real codebases where the task is replacing PyTorch + NCCL with a CUDA kernel that moves data directly over NVLink. We tested frontier coding models such as GPT-5.5, Gemini 3 Pro, Opus 4.7, and others. The evaluation revealed significant performance gaps across the board: under a third of problems were solved correctly, and fewer than a quarter of those beat the naive baseline.\nWe'll cover why they fail, what the patterns look like, and a few cases where models surprisingly produced kernels faster than anything publicly available, including one for NVIDIA NeMo-RL's GRPO training loop, which has no prior optimized public reference.\n\nWhy multi-GPU is different from single-GPU kernel generation\nLLMs have made progress on GPU kernel generation, but that progress has"},{"ref":"P8","kind":"page","title":"Research Intern RL & Post-Training Systems, Turbo (Fall 2026)","date":"2026-06-23T07:02:29.494466+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5168929007","signal_url":null,"signal_json_url":null,"text":"Job Application for Research Intern RL & Post-Training Systems, Turbo (Fall 2026) at Together AI \nBack to jobs New \nResearch Intern RL & Post-Training Systems, Turbo (Fall 2026)\nSan Francisco\n\nApply \nAbout the Role \n\nThe Turbo 
Research team investigates how to make post-training and reinforcement learning for large language models efficient, scalable, and reliable. Our work sits at the intersection of RL algorithms, inference systems, and large-scale experimentation, where the cost and structure of inference dominate overall training efficiency and shape what learning algorithms are practical.\n\nAs a research intern, you will study RL and post-training methods whose performance and scalability are tightly coupled to inference behavior, co-designing algorithms and systems rather than treating them independently. Projects aim to unlock new regimes of experimentation—larger models, longer rollouts, and more complex evaluations—by rethinking how inference, scheduling, and training interact.\n\nRequirements \n\nPursuing a PhD or MS in Computer Science, EE, or a related field (exceptional undergraduates considered)\n\nHave research experience in one or more of:\n\nRL or post-training for large models (e.g., RLHF, RLAIF, GRPO, preference optimization)\n\nML systems (inference engines, runtimes, distributed systems)\n\nLarge-scale empirical ML research or evaluation\n\nAre comfortable with empirical research by designing controlled experiments, while interpreting noisy results and drawing principled conclusions\n\nCan work across abstraction layers:\n\nStrong Python skills for experimentation\n\nWillingness to modify inference or training systems (experience with C++, CUDA, or similar is a plus)\n\nExample Research Directions \n\nIntern projects are tailored to your background and interests, and may include:\n\nInference-Aware RL & Post-Training \n\nDesigning RL or preference-optimization objectives that explicitly account for inference cost and structure (e.g., speculative decoding, partial rollouts, controllable sampling).\n\nStudying how inference-time approximations affect learning dynamics in GRPO-, RLHF-, RLAIF-, or DPO-style methods.\n\nAnalyzing bias, variance, and stability 
trade-offs int"},{"ref":"P9","kind":"page","title":"togethercomputer/ParallelKernelBench repository metadata","date":"2026-06-23T07:02:27.807195+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/ParallelKernelBench","signal_url":null,"signal_json_url":null,"text":"# togethercomputer/ParallelKernelBench\n\nLanguage: Python\n\nStars: 0\n\nForks: 0\n\nOpen issues: 6\n\nCreated: 2026-06-03T19:58:41Z\n\nPushed: 2026-06-23T06:09:38Z\n\nDefault branch: main\n\nFork: no\n\nArchived: no\n\nREADME:\n# ParallelKernelBench: Can LLMs write fast multi-GPU kernels?\n\n<img src=\"images/pkb.png\" width=\"400\" alt=\"ParallelKernelBench\" />\n\nParallelKernelBench (PKB) is a benchmark with the goal of enabling LLMs to optimize multi-GPU kernels. Specifically, we investigate model capabilities on turning existing PyTorch + NCCL reference code into fine-grained CUDA (or related DSLs).\n\nThe design is heavily inspired by [KernelBench](https://github.com/ScalingIntelligence/KernelBench).\n\n<p align=\"center\">\n📄 <a href=\"https://arxiv.org/abs/TBD\"><b>Paper</b></a> &nbsp;·&nbsp;\n🤗 <a href=\"https://huggingface.co/datasets/TBD/ParallelKernelBench\"><b>Hugging Face</b></a> &nbsp;·&nbsp;\n🌐 <a href=\"https://TBD\"><b>Project website</b></a>\n</p>\n\n---\n\n## 👋 Overview\n\n<img src=\"images/pkb_in_a_nutshell.png\" width=\"400\" alt=\"ParallelKernelBench process\" />\n\nPKB asks models to **optimize** multi-GPU kernels: each problem has a PyTorch + NCCL reference under `reference/`; candidates go in `solutions_<backend>/` (CUDA, Triton, ParallelKittens, or run-specific trees from generation).\n\n**Correctness:** `eval` mode runs reference and candidate on the same inputs and compares per-rank outputs (`rank_*.pt`) within `--atol` / `--rtol`.\n\n**Performance:** optional timing reports speedup vs reference. 
We follow [ThunderKittens 2 — benchmark rigor](https://hazyresearch.stanford.edu/blog/2026-02-19-tk-2): 500 warmup iterations, 100 timed iterations (see worker / perf utilities).\n\n**Roofline (approximate):** `reference_rooflines_code/` provides utilization estimates; contributions welcome.\n\n---\n\n## ⚙️ Setup\n\nPKB uses **[uv](https://docs.astral.sh/uv/)** for reproducible Python environments.\n\n### Prerequisites\n\n- **OS:** Linux with NVIDIA GPUs (multi-GPU runs need matching `torchrun` / NCCL).\n- **Driver:** Recent enough for CUDA 12.8 wheels (H100 nodes typically satisfy this).\n- **ParallelKittens backend (optional):** clone [ThunderKittens](https://github.com/HazyResearch/ThunderKittens) and set `THU"},{"ref":"P10","kind":"page","title":"togethercomputer/together-py v2.17.0","date":"2026-06-23T07:02:27.223218+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.17.0","signal_url":null,"signal_json_url":null,"text":"# v2.17.0\n\nRepository: togethercomputer/together-py\n\nTag: v2.17.0\n\nPublished: 2026-06-22T21:06:51Z\n\nPrerelease: no\n\nRelease notes:\n## [2.17.0](https://github.com/togethercomputer/together-py/compare/v2.16.1...v2.17.0) (2026-06-22)\n\n### Features\n\n* Add CLI commands for endpoint adapters ([#401](https://github.com/togethercomputer/together-py/issues/401)) ([a02606a](https://github.com/togethercomputer/together-py/commit/a02606a735958bf153ad8416b9664e1d058f2319))\n* Add CLI commands for endpoint adapters ([#401](https://github.com/togethercomputer/together-py/issues/401)) ([5d791eb](https://github.com/togethercomputer/together-py/commit/5d791eb058773faf55210d6481c57c5efd691be1))\n\n### Bug Fixes\n\n* **ci:** pass app token to every authenticated step in promote workflow ([1baa0a4](https://github.com/togethercomputer/together-py/commit/1baa0a4be94080fdc18907a589a45c6b263ecaba))\n* unbounded read into memory in upload path - ENG-89831 
([#399](https://github.com/togethercomputer/together-py/issues/399)) ([24177a8](https://github.com/togethercomputer/together-py/commit/24177a8ee190a93f5df9c03c6a8bb51a1506fcab))\n* use regular print for jig volumes progress update ([#395](https://github.com/togethercomputer/together-py/issues/395)) ([06d1776](https://github.com/togethercomputer/together-py/commit/06d1776cb2cdb68f618e5b56dadaeff452a6289e))\n\n### Chores\n\n* integrate production changes to staging repo ([#30](https://github.com/togethercomputer/together-py/issues/30)) ([f67f123](https://github.com/togethercomputer/together-py/commit/f67f123acf28094bc4ff3852905969f287e6de37))\n* Log warning when uploading a file that already exists ([3c87cba](https://github.com/togethercomputer/together-py/commit/3c87cba7c95a40762a5d2921497fdb6b2714d3f7))\n* Log warning when uploading a file that already exists ([ee2cee5](https://github.com/togethercomputer/together-py/commit/ee2cee5fc44315d6679afd6b4897a15873e204e2))\n* Update release-please token auth ([f77a46a](https://github.com/togethercomputer/together-py/commit/f77a46a7ca833abdafb52ebbc563e6d1a5a54749))\n* update scripts to use github app ([#32](https://github.com/togethercomputer/together-py/issues/32)) ([fc49a8b](https://github.com/togethercomp"},{"ref":"P11","kind":"page","title":"Workplace Coordinator","date":"2026-06-20T07:04:37.293797+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5166459007","signal_url":null,"signal_json_url":null,"text":"Job Application for Workplace Coordinator at Together AI \nBack to jobs New \nWorkplace Coordinator\nSan Francisco\n\nApply \nAbout the Role \n\nAs a Workplace Coordinator in our San Francisco office, you will partner with Workplace Operations to manage day-to-day office operations and create an environment where employees can thrive. The ideal candidate is organized, personable, and brings strong problem-solving skills to everything they do. 
This position is based in our San Francisco office five days per week and is a great opportunity to make a significant impact in a fast paced environment. \n\nResponsibilities \n\nDaily Operations & Employee Support \n\nAssist with welcoming and onboarding new hires\n\nProvide on-site support and serve as a first point of contact for employee questions \n\nAssist the Workplace team with day-to-day activities as directed\n\nDeliver exceptional customer service through Gmail, Slack, and our Support Portal — for both in-office employees and remote employees\n\nGather and relay employee feedback on the workplace to help drive continuous improvement\n\nManage the swag program \n\nCover the reception desk as needed\n\nFacilities & Office Maintenance \n\nHandle incoming mail and package processing\n\nInitiate maintenance and repair calls to keep the office running efficiently\n\nLiaise with janitorial team to maintain office organization and cleanliness\n\nServe as a liaison to building management regarding facilities issues\n\nCoordinate badge and key fob distribution and access\n\nOversee guest access procedures\n\nServe as an emergency warden \n\nMaintain first aid supplies and ensure health and safety standards are met\n\nFood, Beverage & Events \n\nAssist with the Food & Beverage program\n\nHelp maintain kitchen cleanliness and ensure the snack and beverage program runs smoothly with our third-party vendor\n\nAssist with planning and executing internal events, both in-office and offsite, including team meetings, offsites, and employee celebrations\n\nPartner on logistics, vendor coordination, and on-site support for all internal events\n\nRequirements \n\nStrong work ethic with a commitment to reliability, dependability, and confidentiality\n\nExceptional attention to detail and time "},{"ref":"P12","kind":"page","title":"togethercomputer/together-sandbox 
together-sandbox-workspace-v3.0.0","date":"2026-06-19T07:02:38.590617+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v3.0.0","signal_url":null,"signal_json_url":null,"text":"# together-sandbox-workspace: v3.0.0\n\nRepository: togethercomputer/together-sandbox\n\nTag: together-sandbox-workspace-v3.0.0\n\nPublished: 2026-06-18T09:18:31Z\n\nPrerelease: no\n\nRelease notes:\n## [3.0.0](https://github.com/togethercomputer/together-sandbox/compare/together-sandbox-workspace-v2.0.0...together-sandbox-workspace-v3.0.0) (2026-06-18)\n\n### ⚠ BREAKING CHANGES\n\n* **CSB-1547:** `snapshots.list()` now returns a paginated `Page` of snapshots (TypeScript `Page<Snapshot>`, Python `Page[Snapshot]`) instead of a plain `Snapshot[]` / `list[Snapshot]`.\n\n### Features\n\n* **CSB-1547:** add cursor pagination to list endpoints ([1ee1d1d](https://github.com/togethercomputer/together-sandbox/commit/1ee1d1d3be89ecb0932049e8c40ecfa8223ca133))"},{"ref":"P13","kind":"page","title":"Senior Software Engineer(Amsterdam)","date":"2026-06-18T07:03:29.67467+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5162910007","signal_url":null,"signal_json_url":null,"text":"Job Application for Senior Software Engineer(Amsterdam) at Together AI \nBack to jobs New \nSenior Software Engineer(Amsterdam)\nAmsterdam\n\nApply \nSenior Software Engineer, Identity & Collaboration\n\nTogether.ai is looking for a Senior Software Engineer to take a leading role in the authentication, authorization, and collaboration systems that every Together product depends on. As part of the Product Foundations engineering group, the Identity & Collaboration team owns authentication flows (including SSO and OAuth), organizations, projects, API keys, and the role-based access controls that enable secure collaboration at scale.\n\nEvery customer interaction with Together relies on the systems we build. 
Whether it's a researcher accessing their models, an enterprise team collaborating on a shared project, or a developer making an API call, we make authentication seamless and invisible for simple cases while providing robust, enterprise-grade capabilities for complex organizational structures. Our work directly enables Together's growth from individual users to large enterprise teams, and we're actively building the next generation of collaboration features that will unlock new ways for customers to work together securely and efficiently across all Together products.\n\nAbout the Role\n\nLocation: Hybrid in Amsterdam, NL \n\nFull-time: This means 40 flexible hours, Monday through Friday.\n\nYou'll work with a good deal of autonomy, owning meaningful pieces of our identity and access platform end-to-end, spotting problems worth solving, and contributing to the team's technical direction. You'll also help raise the bar around you through code and design review and by supporting more junior engineers.\n\nResponsibilities \n\nDesign and own authentication and authorisation systems end-to-end: SSO, OAuth/OIDC, SAML, organizations, projects, API keys, and role-based / attribute-based access control\n\nMake and document the technical decisions that shape how identity works across every Together product\n\nBuild across the stack — Elixir/Phoenix services on the backend and TypeScript/Next.js on the frontend — and the APIs other teams build on\n\nContribute directly to our Next.js product surface "},{"ref":"P14","kind":"page","title":"togethercomputer/together-typescript v0.41.2","date":"2026-06-18T07:03:29.485431+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/together-typescript/releases/tag/v0.41.2","signal_url":null,"signal_json_url":null,"text":"# v0.41.2\n\nRepository: togethercomputer/together-typescript\n\nTag: v0.41.2\n\nPublished: 2026-06-17T16:39:52Z\n\nPrerelease: no\n\nRelease notes:\n## 
[0.41.2](https://github.com/togethercomputer/together-typescript/compare/v0.41.1...v0.41.2) (2026-06-17)\n\n### Chores\n\n* Add staging CI syncing ([#3](https://github.com/togethercomputer/together-typescript/issues/3)) ([aa0f6cd](https://github.com/togethercomputer/together-typescript/commit/aa0f6cd7ad27e1db0b2e9a47abd95b882a9a437d))\n* Add stlc promote action ([506f26f](https://github.com/togethercomputer/together-typescript/commit/506f26ff42aa8b782a569bf38ee7d6ffc08fe6ec))\n* Add stlc promote action ([a406a99](https://github.com/togethercomputer/together-typescript/commit/a406a99dae3ab8454eb06f9a483c6fff041bb7fa))\n* Bring back script changes that were lost by code seals ([ad85198](https://github.com/togethercomputer/together-typescript/commit/ad851981a3f4d2b7060f594dc009297cba7c2ca0))\n* fix production repo reference ([3c76ebc](https://github.com/togethercomputer/together-typescript/commit/3c76ebcbbb9788b08ad5e08d4a204441e59aec94))\n* fix production repo reference ([7e51b1a](https://github.com/togethercomputer/together-typescript/commit/7e51b1a1f77d9b53898f457980d279a48185936a))\n* Improve summary docs for remediations ([e72cc01](https://github.com/togethercomputer/together-typescript/commit/e72cc0118b91e74bb1306968fd35f26f7fe0ba1f))\n* integrate production changes to staging repo ([9552ea1](https://github.com/togethercomputer/together-typescript/commit/9552ea155a2c8261ee70172b29b10cef097acfe9))\n* rebuild sync code ([2ce12cb](https://github.com/togethercomputer/together-typescript/commit/2ce12cba1140b2ca9a2af0690fca6a5632bcc7f3))\n* simplify release-please script ([1214db5](https://github.com/togethercomputer/together-typescript/commit/1214db509ce3305ee4f029749b546a0c22830cad))\n* stlc integrate ([76d23e0](https://github.com/togethercomputer/together-typescript/commit/76d23e00bc150e867da20e3cc449eb45a7d42807))\n* sync custom code ([3003122](https://github.com/togethercomputer/together-typescript/commit/300312268e9cfea123d01bc8ae8befd57a2c76a8))\n* sync SDKs 
via stlc ([ae1bab7](https://github.com/togethercomputer/together-typescript/c"},{"ref":"P15","kind":"page","title":"Kimi K2.7 Code vs Claude Fable 5: Landing pages that cost 94% less","date":"2026-06-17T20:04:12.948121+00:00","date_source":null,"source_url":"https://www.together.ai/blog/kimi-k2-7-code-vs-claude-fable-5","signal_url":null,"signal_json_url":null,"text":"Kimi K2.7 Code vs Claude Fable 5: Landing pages that cost 94% less \n\n🚀 Now serving MiniMax-M3 for efficient inference →\n\n⚡ On-demand B200s now available on Together GPU Clusters →\n\n📊 Delivering 31% more TPS than the next-fastest OSS engine for production coding agent workloads →\n\n💬 How Together built the world's fastest speech-to-text stack →\n\n🇫🇷 Join us at RAISE 2026 in Paris →\n\nAll blog posts\n\nInference\n\nPublished 6/17/2026 \n\nKimi K2.7 Code vs Claude Fable 5: Landing pages that cost 94% less\n\nAuthors\n\nHassan El Mghari\n\nTable of contents\n\n40+ Models Chosen for Production...40+ Models Chosen for Production...40+ Models Chosen for Production...\n\nSummary \nWe ran 12 landing pages through Kimi K2.7 Code and Claude Fable 5. Kimi cost 94% less and scored within a few points on nearly every page. Open-source models aren't just cheaper, they're genuinely competitive on quality. And the gap is closing faster than people realize. \n‍\n\nWe ran an experiment where we had Kimi K2.7 Code and Claude Fable 5 each produce 12 landing pages for a side‑by‑side comparison. Overall, Kimi K2.7 Code cost about 94% less than Fable 5 and yielded similar-quality output, especially after we gave Kimi the right context with a design MCP.\nWe published our findings on the OVSC website , along with all variants generated by Claude Opus 4.8, Claude Fable 5, and Kimi K2.7 Code. 
On average Kimi was ~16x cheaper than Fable and ~8x cheaper than Opus.\n\nThe OVSC website lets you explore all the landing pages along with breakdowns of total costs, token usage, and generation time.\nTo understand how we ran this experiment, we started by establishing a baseline and seeing what the model could produce from the prompt alone.\nThe prompts\nWe started with a small set of landing-page prompts across a few different categories, including B2B SaaS, a rooftop speakeasy, and a developer tool for SQL queries. Here's a sample of the prompts we used:\nBuild a landing page for a developer tool that turns SQL queries into charts.\nBuild a landing page for a rooftop speakeasy cocktail bar - art deco, gold-leaf and emerald, 1920s glamour.\nBuild a landing page for a B2B SaaS startup - a team project-ma"},{"ref":"P16","kind":"page","title":"togethercomputer/together-py v2.16.1","date":"2026-06-17T07:02:37.96794+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.16.1","signal_url":null,"signal_json_url":null,"text":"# v2.16.1\n\nRepository: togethercomputer/together-py\n\nTag: v2.16.1\n\nPublished: 2026-06-16T12:26:34Z\n\nPrerelease: no\n\nRelease notes:\n## [2.16.1](https://github.com/togethercomputer/together-py/compare/v2.16.0...v2.16.1) (2026-06-10)\n\n### Chores\n\n* Add staging CI syncing ([#5](https://github.com/togethercomputer/together-py/issues/5)) ([3e314e5](https://github.com/togethercomputer/together-py/commit/3e314e59c1db68d6379972a9a9d9fa1f4dd2be00))\n* Add stlc promote action ([0dd26bd](https://github.com/togethercomputer/together-py/commit/0dd26bd284aedbcccd6ce4638e1e91953662c354))\n* Add stlc promote action ([8f092fd](https://github.com/togethercomputer/together-py/commit/8f092fd95f72d3f62d0964ce1f6e861d9168930c))\n* Fix lock files and type issue ([#4](https://github.com/togethercomputer/together-py/issues/4)) 
([dd9b3cd](https://github.com/togethercomputer/together-py/commit/dd9b3cda988599ec29202368d63ac4d3e5ed5586))\n* fix production repo reference ([d25fc13](https://github.com/togethercomputer/together-py/commit/d25fc13d1d4cd9bdfc911998dbe6e2ca4389fb37))\n* Improve summary docs for remediations ([5bb4793](https://github.com/togethercomputer/together-py/commit/5bb479351550c5c84c1314634363ea5304bade8e))\n* sync custom code ([985e12e](https://github.com/togethercomputer/together-py/commit/985e12e6ad061c67e79e67cc96342d950bb33853))"},{"ref":"P17","kind":"page","title":"Research Intern, Model Shaping (Fall 2026)","date":"2026-06-16T07:02:10.870343+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157661007","signal_url":null,"signal_json_url":null,"text":"Job Application for Research Intern, Model Shaping (Fall 2026) at Together AI \nBack to jobs tags.new \nResearch Intern, Model Shaping (Fall 2026)\nSan Francisco\n\nApply \nAbout The Role \n\nAs a Research Intern in the Model Shaping team, you will work on one or more of the following areas:\n\nAdvanced post-training methods across supervised learning, preference optimization, and reinforcement learning\n\nNew techniques and systems for efficient training of neural networks (e.g., distributed training, algorithmic improvements, optimization methods)\n\nRobust and reliable evaluation of foundation model capabilities\n\nThe Model Shaping team at Together AI works on products and research for tailoring open foundation models to downstream applications. We build services that allow machine learning developers to choose the best models for their tasks and further improve these models using domain-specific data. 
In addition to that, we develop new methods for more efficient model training and evaluation, drawing inspiration from a broad spectrum of ideas across machine learning, natural language processing, and ML systems.\n\nPast research led by Model Shaping interns resulted in the following publications:\n\nEscaping the Verifier: Learning to Reason via Demonstrations (ICML 2026)\n\nUntied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking (ICML 2026)\n\n​​ FFT-based Dynamic Subspace Selection for Low-Rank Adaptive Optimization of Large Language Models (ICLR 2026)\n\nResponsibilities \n\nResearch and implement novel techniques in one or more of our focus areas\n\nDesign and conduct rigorous experiments to validate hypotheses\n\nDocument findings in scientific publications and blog posts\n\nIntegrate the research results into Together products\n\nRequirements \n\nCurrently pursuing a Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field\n\nStrong knowledge of Machine Learning and Deep Learning fundamentals\n\nExperience with deep learning frameworks (PyTorch, JAX, etc.)\n\nFamiliarity with the Transformer architecture and recent developments in foundation models\n\nPreferred Requirements \n\nPrior research experience with training foundation models o"},{"ref":"P18","kind":"page","title":"Backend Engineer ","date":"2026-06-16T07:02:10.344698+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5068767007","signal_url":null,"signal_json_url":null,"text":"Job Application for Backend Engineer at Together AI \nBack to jobs \nBackend Engineer \nAmsterdam\n\nApply \nAbout the role \n\nTogether AI/Codesandbox is looking for a Senior Backend/Distributed Systems engineer to help us build and maintain the codebase that powers the Together AI Sandbox service. 
This is a role for engineers that are familiar with standard backend architecture, database design, and high-performance backend services. Our API platform is under constant load and scrutiny, so a key part of the role is experience and commitment to writing easily understood and well-tested code. You will be working closely with the product team to understand and document the functional needs of their product requirements, developing new code to solve new problems, as well as maintaining existing code to squash bugs.\n\nLocation: Hybrid in Amsterdam, and must have be based in the Netherlands or have the relevant visa to be based in the Netherlands without sponsorship.\n\nFull-time: This means 40 flexible hours, Monday through Friday.\n\nResponsibilities \n\nDesign core, backend software components\n\nPerform architecture and research work for AI workloads\n\nInterface with other teams to incorporate their innovations\n\nAnalyze and improve efficiency, scalability, and stability of various system resources\n\nConduct design and code reviews\n\nCreate services, tools and developer documentation\n\nCreate testing frameworks for robustness and fault-tolerance\n\nParticipate in an on-call rotation to respond to critical incidents as needed\n\nRequirements \n\n5+ years experience writing high-performance, well-tested, production quality code\n\nBachelor's or Master's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience\n\nDemonstrated experience in building large scale, fault tolerant, distributed systems like storage, search, and computation\n\nExpert level programmer in one or more of Golang, Java, Rust, or C/C++\n\nDemonstrated experience with relational (e.g., PostgreSQL) and non-relational (e.g., ClickHouse, Redis) databases\n\nExperience designing, analyzing and improving efficiency, scalability, and stability of various system resources\n\nExcellent und"},{"ref":"P19","kind":"page","title":"Research 
Intern, Inference (Fall 2026)","date":"2026-06-13T07:01:44.581491+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157759007","signal_url":null,"signal_json_url":null,"text":"Job Application for Research Intern, Inference (Fall 2026) at Together AI \nBack to jobs New \nResearch Intern, Inference (Fall 2026)\nSan Francisco\n\nApply \nAbout The Role \n\nThe Inference Research team is dedicated to building the next generation of efficient, scalable, and reliable serving systems for large foundation models, directly contributing to the mission of advancing open and transparent AI. Our work operates at the critical intersection of cutting-edge model architectures, high-performance systems engineering, and deep hardware optimization. We focus on co-designing software, algorithms, and models to significantly lower the cost and latency of modern AI systems.\n\nAs a research intern, you will dive into the complexities of distributed inference, compiler-aware optimization, and novel inference-time computation strategies (such as speculative decoding and phase-aware execution). You will be tasked with co-designing and implementing cross-layer optimizations across models, systems, and hardware, with a focus on areas like KV cache design and large-scale serving architectures.\n\nProjects aim to unlock unprecedented performance and scale for foundation models, enabling faster serving, larger model deployment (e.g., Mixture-of-Experts), and robust, reproducible evaluation under realistic serving workloads.\n\nResponsibilities \n\nDesign and conduct rigorous experiments to validate hypotheses\n\nCommunicate the plans, progress, and results of projects to the broader team\n\nDocument findings in scientific publications and blog posts\n\nRequirements \n\nCurrently pursuing a final year of Bachelor's, Master's, or Ph.D. 
degree in Computer Science, Electrical Engineering, or a related field\n\nStrong knowledge of Machine Learning and Deep Learning fundamentals\n\nExperience with deep learning frameworks (PyTorch, JAX, etc.)\n\nStrong programming skills in Python\n\nFamiliarity with Transformer architectures and recent developments in foundation models\n\nPreferred Qualifications \n\nPrior research experience in foundation models , efficient machine learning , or ML systems .\n\nPublications at leading conferences in machine learning or systems (i.e., MLSys, ICLR ).\n\nExperience with CUDA pro"},{"ref":"P20","kind":"page","title":"Frontier Agents Intern (Fall 2026)","date":"2026-06-13T07:01:44.539117+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157380007","signal_url":null,"signal_json_url":null,"text":"Job Application for Frontier Agents Intern (Fall 2026) at Together AI \nBack to jobs New \nFrontier Agents Intern (Fall 2026)\nSan Francisco\n\nApply \nAbout the Role \n\nThe Agents team investigates how to build, align, and scale frontier AI systems that can tackle complex, multi-step tasks and workflows across text and speech, with a particular focus on agentic and scientific domains. Our work sits at the intersection of agent capabilities, human-computer interaction, and infrastructure—from designing post-training methods for agentic behavior to developing evaluation frameworks for open-ended tasks where traditional metrics fall short. \n\nAs a research intern, you will work on problems at the frontier of agentic AI, where challenges in alignment, reliability, and scalability are deeply intertwined. Projects may involve developing new training recipes for self-learning and long-horizon reasoning, curating datasets for non-deterministic scientific and agentic tasks, studying failure modes in agentic behavior, or building infrastructure that enables agent operations at scale. 
You'll operate in a space where algorithmic innovation, dataset and interaction design, and systems work come together to push the boundaries of what AI agents can reliably accomplish. \n\nResponsibilities \n\nResearch and implement novel techniques in one or more of our focus areas \n\nDesign and conduct rigorous experiments to validate hypotheses \n\nDocument findings in scientific publications and blog posts \n\nCommunicate the plans, progress, and results of projects to the broader team \n\nRequirements \n\nCurrently pursuing a Masters or Ph.D. degree in Computer Science, Electrical Engineering, Information Science, or a related field \n\nPublications at leading ML, NLP, or speech conferences or journals (such as NeurIPS, ICML, ICLR, *ACL, EMNLP, Interspeech) \n\nStrong knowledge of Machine Learning and Deep Learning fundamentals \n\nExperience with deep learning frameworks (PyTorch, JAX, etc.) \n\nUnderstanding of how LLMs work \n\nStrong programming skills in Python \n\nFamiliarity with Transformer architectures and recent developments in foundation models \n\nExample Research Directions \n\nTraining, developing, and evalu"},{"ref":"P21","kind":"page","title":"Systems Research Engineer Intern - GPU Programming (Fall 2026)","date":"2026-06-13T07:01:44.230058+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157559007","signal_url":null,"signal_json_url":null,"text":"Job Application for Systems Research Engineer Intern - GPU Programming (Fall 2026) at Together AI \nBack to jobs New \nSystems Research Engineer Intern - GPU Programming (Fall 2026)\nSan Francisco\n\nApply \nAbout The Role \n\nAs a Systems Research Engineer Intern specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. 
Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems. Collaborating with the hardware and software teams, you will contribute to the co-design of efficient GPU architectures and programming models, leveraging your expertise in GPU programming and parallel computing. Your research skills will be vital in staying up-to-date with the latest advancements in GPU programming techniques, ensuring that our AI infrastructure remains at the forefront of innovation.\n\nResponsibilities \n\nOptimize and fine-tune GPU code to achieve better performance and scalability\n\nCollaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems\n\nStay up-to-date with the latest advancements in GPU programming techniques and technologies\n\nRequirements \n\nStrong background in GPU programming and parallel computing, such as CUDA and/or Triton.\n\nKnowledge of ML/AI applications and models\n\nKnowledge of performance profiling and optimization tools for GPU programming\n\nExcellent problem-solving and analytical skills\n\nInternship Program Details \n\nOur fall internship program spans over 12 to 16 weeks where you’ll have the opportunity to work with industry-leading engineers building a cloud from the ground up and possibly contribute to influential open source projects. Our internship dates are September 14th to December 18th. \n\nAbout Together AI \n\nTogether AI is a research-driven artificial intelligence company. 
We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, algorithms, and "},{"ref":"P22","kind":"page","title":"Data Center Operations Coordinator","date":"2026-06-12T07:03:33.535329+00:00","date_source":null,"source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5160139007","signal_url":null,"signal_json_url":null,"text":"Job Application for Data Center Operations Coordinator at Together AI \nBack to jobs New \nData Center Operations Coordinator\nSan Francisco\n\nApply \nAbout the Role \n\nWe’re looking for a detail-oriented Data Center Operations professional to manage and track all break/fix activities across multiple data center locations. This role acts as the central point of coordination for hardware incidents, vendor dispatches, ticket management, asset tracking, and operational reporting to ensure maximum uptime and fast issue resolution.\n\nResponsibilities \n\nTrack and manage all break/fix incidents across multiple data centers\n\nMonitor ticket queues and ensure SLA compliance for incident response and resolution\n\nCoordinate with on-site technicians, remote hands teams, vendors, and engineering groups\n\nMaintain accurate records of failed hardware, replacements, RMAs, and repair status\n\nEscalate critical outages and recurring infrastructure issues to leadership and engineering teams\n\nSchedule and oversee maintenance windows and emergency repair activities\n\nProvide daily/weekly operational status reports and incident summaries\n\nEnsure all work follows data center operational procedures and change management policies\n\nIdentify trends in hardware failures and recommend process improvements\n\nRequirements \n\nExperience working in data center operations, IT infrastructure, or hardware support\n\nStrong understanding of server, storage, and networking hardware\n\nExperience 
with ticketing systems such as ServiceNow, Jira, or Remedy\n\nAbility to manage multiple priorities across several sites simultaneously\n\nExcellent communication and organizational skills\n\nFamiliarity with SLA management and incident escalation processes\n\nProficiency with Excel, reporting dashboards, and inventory tracking tools\n\nPreferred Qualifications\n\nExperience supporting enterprise or hyperscale data centers\n\nKnowledge of remote hands operations and vendor management\n\nUnderstanding of ITIL processes and change management\n\nCompTIA Server+, Network+, or similar certifications\n\nAbout Together AI \n\nTogether AI is a research-driven AI infrastructure company on a mission to dramatically lower the cost of modern AI by co-designi"},{"ref":"P23","kind":"page","title":"togethercomputer/InferenceX repository metadata","date":"2026-06-12T07:03:32.197941+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/InferenceX","signal_url":null,"signal_json_url":null,"text":"# togethercomputer/InferenceX\n\nDescription: Open Source Continuous Inference Benchmark Research Platform Kimi K2.6, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soon™ TPUv6e/v7/Trainium2/3\n\nLicense: Apache-2.0\n\nStars: 0\n\nForks: 0\n\nOpen issues: 0\n\nCreated: 2026-06-11T20:56:28Z\n\nPushed: 2026-06-12T01:17:45Z\n\nDefault branch: main\n\nFork: yes\n\nParent repository: SemiAnalysisAI/InferenceX\n\nArchived: no\n\nREADME:\n# InferenceX™, Open Source Continuous Inference Standard and Research Platform \n## Trusted by Operators of Trillion Dollar Token Factories such as OpenAI, Microsoft, Oracle, etc, & ML Community such as PyTorch Foundation, vLLM, SGLang, Tri Dao\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/SemiAnalysisAI/InferenceX/blob/main/LICENSE)\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/SemiAnalysisAI/InferenceX/pulls)\n[![GitHub 
Stars](https://img.shields.io/github/stars/SemiAnalysisAI/InferenceX?style=social)](https://github.com/SemiAnalysisAI/InferenceX)\n\nInferenceX™ (formerly InferenceMAX) is an inference performance research platform dedicated to continually analyzing & benchmarking the world’s most popular open-source inference frameworks used by major token factories and models to track real performance in real time. As these software stacks improve, InferenceX™ captures that progress in near real-time, providing a live indicator of inference performance progress. A [open sourced](https://github.com/SemiAnalysisAI/InferenceX-app) live dashboard is available for free publicly at https://inferencex.com/. \n\n> [!IMPORTANT]\n> Only [SemiAnalysisAI/InferenceX](https://github.com/SemiAnalysisAI/InferenceX) repo contains the Official InferenceX™ result, all other forks & repos are Unofficial. The benchmark setup & quality of machines/clouds in unofficial repos may be differ leading to subpar benchmarking. Unofficial must be explicitly labelled as Unofficial.\n> Forks may not remove this disclaimer\n\n[Full Article Write Up for InferenceXv2](https://newsletter.semianalysis.com/p/inferencex-v2-nvidia-blackwell-vs)\n[Full Article Write Up for InferenceXv1](https://n"},{"ref":"P24","kind":"page","title":"Building trust in enterprise AI: Together AI earns ISO 27001:2022 certification","date":"2026-06-11T07:04:02.334072+00:00","date_source":null,"source_url":"https://www.together.ai/blog/iso-27001-2022-certification","signal_url":null,"signal_json_url":null,"text":"Building trust in enterprise AI: Together AI earns ISO 27001:2022 certification \n\n🚀 Now serving MiniMax-M3 for efficient inference →\n\n⚡ On-demand B200s now available on Together GPU Clusters →\n\n📊 Delivering 31% more TPS than the next-fastest OSS engine for production coding agent workloads →\n\n💬 How Together built the world's fastest speech-to-text stack →\n\n🇫🇷 Join us at RAISE 2026 in Paris →\n\nAll blog 
posts\n\nCompany\n\nPublished 6/10/2026 \n\nBuilding trust in enterprise AI: Together AI earns ISO 27001:2022 certification\n\nISO 27001:2022 builds on our existing compliance program and reinforces our commitment to helping customers run production-grade AI workloads on secure, well-governed infrastructure.\n\nAuthors\n\nLisa Ruggiero, Derek Chamorro\n\nTable of contents\n\n40+ Models Chosen for Production...40+ Models Chosen for Production...40+ Models Chosen for Production...\n\nLinks in this article\n\nTrust Center \n\nSummary\n\nTogether AI has received an ISO 27001:2022 certification from A-LIGN Compliance and Security, Inc., an ANAB-accredited certification body, confirming that our Information Security Management System (ISMS) meets the latest international standard for information security management. This milestone builds on our existing compliance program and reinforces our commitment to helping customers run production-grade AI workloads on secure, well-governed infrastructure.\nThe certification reflects a comprehensive, multi-month assessment of how we manage risk, secure data, and continuously improve security across our organization and platform.\n\nScope of the certification\nThe ISO 27001:2022 certification is scoped to the ISMS supporting Together AI’s global platform, including the systems, processes, and controls that protect customer data and platform operations. This scope covers our corporate headquarters as well as the security of third‑party data centers that provide hosting and colocation services for Together AI’s infrastructure.\nWhat ISO means for customers\nISO 27001:2022 is the leading international standard for establishing, implementing, maintaining, and continually improving an ISMS. 
By certifying our ISMS to this standard, an independent auditor has"},{"ref":"P25","kind":"page","title":"togethercomputer/Port_FasterTransformer repository metadata","date":"2026-06-11T04:19:31.952944+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/Port_FasterTransformer","signal_url":null,"signal_json_url":null,"text":"# togethercomputer/Port_FasterTransformer\n\nDescription: Transformer related optimization, including BERT, GPT\n\nLanguage: C++\n\nLicense: Apache-2.0\n\nStars: 1\n\nForks: 1\n\nOpen issues: 1\n\nCreated: 2022-10-30T13:37:30Z\n\nPushed: 2023-05-28T04:48:28Z\n\nDefault branch: main\n\nFork: yes\n\nParent repository: NVIDIA/FasterTransformer\n\nArchived: no\n\nREADME:\n# Port_FasterTransformer \n\nTo bring up a standalone node:\n\n```console\nmkdir .together\ndocker run --rm --gpus all --ipc=host \\\n-e NUM_WORKERS=auto \\\n-e CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES \\\n-v $PWD/.together:/home/user/.together \\\n-it togethercomputer/fastertransformer /usr/local/bin/together start \\\n--color --config /home/user/cfg.yaml --worker.model GPT-JT-6B-v1-tp1\n```\n\n```console\ndocker run --rm --gpus '\"device=3,4\"' --ipc=host \\\n-e NUM_WORKERS=auto \\\n-v $PWD/.together:/home/user/.together \\\n-it togethercomputer/fastertransformer /usr/local/bin/together start \\\n--color --config /home/user/cfg.yaml --worker.model opt-13b-tp2\n```\n\n# Development commands\n\n```console\ndocker build -t port_ft_gpt_jt -f GPT-JT-Dockerfile \n\nnvidia-docker run --ipc=host --network=host --name port_ft -ti -v /root/fm/models/ft_model:/workspace/Port_FasterTransformer/build/model -v /root/fm/dev/Port_FasterTransformer/examples/:/workspace/Port_FasterTransformer/examples -v /root/fm/dev/Port_FasterTransformer/src/fastertransformer:/workspace/Port_FasterTransformer/src/fastertransformer port_ft bash\n\nnvidia-docker run --ipc=host --network=host --name port_ft -ti -v /home/binhang/active/ft_model:/workspace/Port_FasterTransformer/build/model 
-v /home/binhang/active/Port_FasterTransformer/examples:/workspace/Port_FasterTransformer/examples -v /home/binhang/active/Port_FasterTransformer/src/fastertransformer:/workspace/Port_FasterTransformer/src/fastertransformer port_fasttransformer bash\n\nmpirun -n 8 --allow-run-as-root python /workspace/Port_FasterTransformer/examples/pytorch/gpt/port_opt_inference.py --weights_data_type fp16 --data_type fp16 --vocab_size 50272 --max_batch_size 1 --max_seq_len 2048 --tensor_para_size 8 --ckpt_path /workspace/Port_FasterTransformer/build/model/opt-66b-fp16-tp8/8-gpu --lib_path /workspace/Port_FasterTransformer/build"},{"ref":"P26","kind":"page","title":"togethercomputer/flash-attention repository metadata","date":"2026-06-11T04:19:31.909418+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/flash-attention","signal_url":null,"signal_json_url":null,"text":"# togethercomputer/flash-attention\n\nDescription: Fast and memory-efficient exact attention\n\nLanguage: Python\n\nLicense: BSD-3-Clause\n\nStars: 1\n\nForks: 1\n\nOpen issues: 0\n\nCreated: 2022-11-22T23:05:11Z\n\nPushed: 2023-08-30T18:03:03Z\n\nDefault branch: main\n\nFork: yes\n\nParent repository: Dao-AILab/flash-attention\n\nArchived: no\n\nREADME:\n# FlashAttention\nThis repository provides the official implementation of FlashAttention and\nFlashAttention-2 from the\nfollowing papers.\n\n**FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness** \nTri Dao, Daniel Y. 
Fu, Stefano Ermon, Atri Rudra, Christopher Ré \nPaper: https://arxiv.org/abs/2205.14135 \nIEEE Spectrum [article](https://spectrum.ieee.org/mlperf-rankings-2022) about our submission to the MLPerf 2.0 benchmark using FlashAttention.\n![FlashAttention](assets/flashattn_banner.jpg)\n\n**FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning** \nTri Dao\n\nPaper: https://tridao.me/publications/flash2/flash2.pdf\n\n![FlashAttention-2](assets/flashattention_logo.png)\n\n## Usage\n\nWe've been very happy to see FlashAttention being widely adopted in such a short\ntime after its release. This [page](https://github.com/Dao-AILab/flash-attention/blob/main/usage.md)\ncontains a partial list of places where FlashAttention is being used.\n\nFlashAttention and FlashAttention-2 are free to use and modify (see LICENSE).\nPlease cite and credit FlashAttention if you use it.\n\n## Installation and features\n\nRequirements:\n- CUDA 11.4 and above.\n- PyTorch 1.12 and above.\n\nWe recommend the\n[Pytorch](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)\ncontainer from Nvidia, which has all the required tools to install FlashAttention.\n\nTo install:\n1. Make sure that PyTorch is installed.\n2. Make sure that `packaging` is installed (`pip install packaging`)\n3. Make sure that `ninja` is installed and that it works correctly (e.g. `ninja\n--version` then `echo $?` should return exit code 0). If not (sometimes `ninja\n--version` then `echo $?` returns a nonzero exit code), uninstall then reinstall\n`ninja` (`pip uninstall -y ninja && pip install ninja`). 
Without `ninja`,\ncompiling can take a very long time (2h) since it "},{"ref":"P27","kind":"page","title":"togethercomputer/diffusers repository metadata","date":"2026-06-11T04:19:31.766164+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/diffusers","signal_url":null,"signal_json_url":null,"text":"# togethercomputer/diffusers\n\nDescription: 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch\n\nLanguage: Python\n\nLicense: Apache-2.0\n\nStars: 4\n\nForks: 3\n\nOpen issues: 2\n\nCreated: 2022-11-22T23:05:32Z\n\nPushed: 2026-05-08T13:54:43Z\n\nDefault branch: main\n\nFork: yes\n\nParent repository: HazyResearch/diffusers\n\nArchived: no\n\nREADME:\n# Diffusers + FlashAttention\n\nThis is a branch of [HuggingFace Diffusers](https://github.com/huggingface/diffusers) to incorporate FlashAttention, optimized for high throughput.\n\n**Update 10/31/22**: Easier install! You can either run from our Docker image, or install from source. 
Bonus: We no longer rely on the cutlass branch of FlashAttention!\n\n## Installation\n\n**From our Docker image:**\n\nYou can run from our [Docker image](https://hub.docker.com/layers/danfu09/diffusers/0.1/images/sha256-033c41564f01894e922f93018e87046c8719ce63bd942939fb1d38627577811e?context=explore):\n```\ndocker run -it --rm --gpus all danfu09/diffusers:0.1 zsh\nhuggingface-cli login\ncd diffusers\npython test.py --batch_size 1 # how many images to generate at once\n```\n\n**To install from source:**\n\nFlashAttention requires CUDA 11, NVCC, and a Turing or Ampere GPU.\nTo install FlashAttention:\n```\ngit clone https://github.com/HazyResearch/flash-attention.git\ncd flash-attention\ngit submodule init\ngit submodule update\npython setup.py install\ncd ..\n```\n\nTo install diffusers:\n```\ngit clone https://github.com/HazyResearch/diffusers.git\ncd diffusers\npip install -e .\n```\n\n## Running\n\nA sample benchmark, following HuggingFace's [benchmark](https://twitter.com/Nouamanetazi/status/1576959648912973826) of diffusers:\n```Python\nimport time\nimport torch\nfrom diffusers import StableDiffusionPipeline\nimport functools\n\n# torch disable grad\ntorch.set_grad_enabled(False)\n\ntorch.manual_seed(1231)\ntorch.cuda.manual_seed(1231)\n\nprompt = \"a photo of an astronaut riding a horse on mars\"\n\n# cudnn benchmarking\ntorch.backends.cudnn.benchmark = True\n\n# make sure you're logged in with `huggingface-cli login`\npipe = StableDiffusionPipeline.from_pretrained(\n\"CompVis/stable-diffusion-v1-4\", \nuse_auth_token=True,\nrevision=\"fp16\",\ntorch_dtype=torch.float16\n).to(\"cuda\")\n"},{"ref":"P28","kind":"page","title":"togethercomputer/together-chat repository metadata","date":"2026-06-11T04:19:31.658895+00:00","date_source":null,"source_url":"https://github.com/togethercomputer/together-chat","signal_url":null,"signal_json_url":null,"text":"# togethercomputer/together-chat\n\nDescription: Streamlit Component, for a Chatbot UI\n\nLicense: MIT\n\nStars: 
2\n\nForks: 0\n\nOpen issues: 14\n\nCreated: 2023-01-17T16:51:32Z\n\nPushed: 2024-06-18T02:53:02Z\n\nDefault branch: main\n\nFork: yes\n\nParent repository: AI-Yash/st-chat\n\nArchived: no\n\nREADME:\n# st-chat\n\nStreamlit Component, for a Chat-bot UI, [example app](https://share.streamlit.io/ai-yash/st-chat/main/examples/chatbot.py)\n\nauthors - [@yashppawar](https://github.com/yashppawar) & [@YashVardhan-AI](https://github.com/yashvardhan-ai)\n\n## Installation\n\nInstall `streamlit-chat` with pip\n```bash\npip install streamlit-chat \n```\n\nusage, import the `message` function from `streamlit_chat`\n```py\nimport streamlit as st\nfrom streamlit_chat import message\n\nmessage(\"My message\") \nmessage(\"Hello bot!\", is_user=True) # align's the message to the right\n```"},{"ref":"E1","kind":"event","title":"togethercomputer/together-py v2.20.0","date":"2026-06-26T20:09:05+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.20.0","signal_url":"https://onlylabs.fyi/signals/62883a63-dd21-4e56-bb56-2d2f0a504983","signal_json_url":"https://onlylabs.fyi/signals/62883a63-dd21-4e56-bb56-2d2f0a504983/signal.json","text":"release · togethercomputer/together-py v2.20.0 · signal_desk=releases · occurred_at=2026-06-26T20:09:05+00:00 · url=https://github.com/togethercomputer/together-py/releases/tag/v2.20.0 · raw={\"repo\":\"togethercomputer/together-py\"}"},{"ref":"E2","kind":"event","title":"Head of Hyperscaler Partnerships","date":"2026-06-26T18:19:02+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5171124007","signal_url":"https://onlylabs.fyi/signals/fe0eec71-e63c-43b1-8526-de6053d594d0","signal_json_url":"https://onlylabs.fyi/signals/fe0eec71-e63c-43b1-8526-de6053d594d0/signal.json","text":"job_opened · Head of Hyperscaler Partnerships · signal_desk=hiring · occurred_at=2026-06-26T18:19:02+00:00 · 
url=https://job-boards.greenhouse.io/togetherai/jobs/5171124007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E3","kind":"event","title":"togethercomputer/together-storage-claude-skills","date":"2026-06-26T13:20:39+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-storage-claude-skills","signal_url":"https://onlylabs.fyi/signals/ba9fbb5a-2c26-4334-8833-7c0ba04dd0ce","signal_json_url":"https://onlylabs.fyi/signals/ba9fbb5a-2c26-4334-8833-7c0ba04dd0ce/signal.json","text":"repo_new · togethercomputer/together-storage-claude-skills · signal_desk=repos · occurred_at=2026-06-26T13:20:39+00:00 · url=https://github.com/togethercomputer/together-storage-claude-skills · raw={\"repo\":\"togethercomputer/together-storage-claude-skills\",\"description\":\"Claude Code skills for deploying & verifying Together T4 + CS3 over Rook-Ceph (sanitized runbooks)\",\"language\":\"Go\"}"},{"ref":"E4","kind":"event","title":"Software Engineer(Amsterdam)","date":"2026-06-25T08:59:38+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5169470007","signal_url":"https://onlylabs.fyi/signals/770892b7-eedc-4279-9ef5-a8f9c725ee53","signal_json_url":"https://onlylabs.fyi/signals/770892b7-eedc-4279-9ef5-a8f9c725ee53/signal.json","text":"job_opened · Software Engineer(Amsterdam) · signal_desk=hiring · occurred_at=2026-06-25T08:59:38+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5169470007 · raw={\"location\":\"Amsterdam\",\"ats\":\"greenhouse\"}"},{"ref":"E5","kind":"event","title":"togethercomputer/together-py v2.19.0","date":"2026-06-24T21:26:38+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.19.0","signal_url":"https://onlylabs.fyi/signals/1154857a-4654-40e9-8e4a-6867260444eb","signal_json_url":"https://onlylabs.fyi/signals/1154857a-4654-40e9-8e4a-6867260444eb/signal.json","text":"release · 
togethercomputer/together-py v2.19.0 · signal_desk=releases · occurred_at=2026-06-24T21:26:38+00:00 · url=https://github.com/togethercomputer/together-py/releases/tag/v2.19.0 · raw={\"repo\":\"togethercomputer/together-py\"}"},{"ref":"E6","kind":"event","title":"Product Manager, AI Infrastructure","date":"2026-06-24T21:25:41+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5172169007","signal_url":"https://onlylabs.fyi/signals/485ef22f-5017-4a25-b07e-09ea8d20bacf","signal_json_url":"https://onlylabs.fyi/signals/485ef22f-5017-4a25-b07e-09ea8d20bacf/signal.json","text":"job_opened · Product Manager, AI Infrastructure · signal_desk=hiring · occurred_at=2026-06-24T21:25:41+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5172169007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E7","kind":"event","title":"togethercomputer/together-py v2.18.0","date":"2026-06-24T16:02:39+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.18.0","signal_url":"https://onlylabs.fyi/signals/84bc1659-abc4-402d-91d7-7d5ab1121bae","signal_json_url":"https://onlylabs.fyi/signals/84bc1659-abc4-402d-91d7-7d5ab1121bae/signal.json","text":"release · togethercomputer/together-py v2.18.0 · signal_desk=releases · occurred_at=2026-06-24T16:02:39+00:00 · url=https://github.com/togethercomputer/together-py/releases/tag/v2.18.0 · raw={\"repo\":\"togethercomputer/together-py\"}"},{"ref":"E8","kind":"event","title":"Senior Software Engineer(Amsterdam)","date":"2026-06-24T09:37:43+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5162910007","signal_url":"https://onlylabs.fyi/signals/7e8c65b5-f295-4df6-9cdf-a8c5c44a34b3","signal_json_url":"https://onlylabs.fyi/signals/7e8c65b5-f295-4df6-9cdf-a8c5c44a34b3/signal.json","text":"job_opened · Senior Software Engineer(Amsterdam) · signal_desk=hiring · 
occurred_at=2026-06-24T09:37:43+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5162910007 · raw={\"location\":\"Amsterdam\",\"ats\":\"greenhouse\"}"},{"ref":"E9","kind":"event","title":"Research Intern, Model Shaping (Fall 2026)","date":"2026-06-23T19:42:26+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157661007","signal_url":"https://onlylabs.fyi/signals/b7c0f3f4-4a33-4e4f-8166-b8a4d9762228","signal_json_url":"https://onlylabs.fyi/signals/b7c0f3f4-4a33-4e4f-8166-b8a4d9762228/signal.json","text":"job_opened · Research Intern, Model Shaping (Fall 2026) · signal_desk=hiring · occurred_at=2026-06-23T19:42:26+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5157661007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E10","kind":"event","title":"Research Intern, Inference (Fall 2026)","date":"2026-06-23T19:42:04+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157759007","signal_url":"https://onlylabs.fyi/signals/76230eac-601d-4ea0-baa3-88d80295e6d1","signal_json_url":"https://onlylabs.fyi/signals/76230eac-601d-4ea0-baa3-88d80295e6d1/signal.json","text":"job_opened · Research Intern, Inference (Fall 2026) · signal_desk=hiring · occurred_at=2026-06-23T19:42:04+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5157759007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E11","kind":"event","title":"Frontier Agents Intern (Fall 2026)","date":"2026-06-23T19:41:37+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157380007","signal_url":"https://onlylabs.fyi/signals/a49cb245-bf96-428b-a2c9-0d6042c3b733","signal_json_url":"https://onlylabs.fyi/signals/a49cb245-bf96-428b-a2c9-0d6042c3b733/signal.json","text":"job_opened · Frontier Agents Intern (Fall 2026) · signal_desk=hiring · occurred_at=2026-06-23T19:41:37+00:00 · 
url=https://job-boards.greenhouse.io/togetherai/jobs/5157380007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E12","kind":"event","title":"Platform Engineer, Model Shaping","date":"2026-06-23T14:54:34+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/4790243007","signal_url":"https://onlylabs.fyi/signals/43c34485-d3ef-49dc-a5a9-49544ec26339","signal_json_url":"https://onlylabs.fyi/signals/43c34485-d3ef-49dc-a5a9-49544ec26339/signal.json","text":"job_opened · Platform Engineer, Model Shaping · signal_desk=hiring · occurred_at=2026-06-23T14:54:34+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/4790243007 · raw={\"location\":\"San Francisco \",\"ats\":\"greenhouse\"}"},{"ref":"E13","kind":"event","title":"ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet)","date":"2026-06-23T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/parallelkernelbench","signal_url":"https://onlylabs.fyi/signals/85c5ec42-d179-4a01-9ea9-2d182bc2737f","signal_json_url":"https://onlylabs.fyi/signals/85c5ec42-d179-4a01-9ea9-2d182bc2737f/signal.json","text":"post_published · ParallelKernelBench: Frontier LLMs can't write fast multi-GPU kernels (yet) · signal_desk=talking · occurred_at=2026-06-23T00:00:00+00:00 · url=https://www.together.ai/blog/parallelkernelbench · raw={\"excerpt\":\"ParallelKernelBench tests whether LLMs can write fast multi-GPU CUDA kernels across 87 real workloads. 
The best model solves under a third, but a few generated kernels beat any public implementation.\"}"},{"ref":"E14","kind":"event","title":"togethercomputer/together-py v2.17.0","date":"2026-06-22T21:06:51+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.17.0","signal_url":"https://onlylabs.fyi/signals/514c3eef-382f-42f8-979e-00cbf609c5bf","signal_json_url":"https://onlylabs.fyi/signals/514c3eef-382f-42f8-979e-00cbf609c5bf/signal.json","text":"release · togethercomputer/together-py v2.17.0 · signal_desk=releases · occurred_at=2026-06-22T21:06:51+00:00 · url=https://github.com/togethercomputer/together-py/releases/tag/v2.17.0 · raw={\"repo\":\"togethercomputer/together-py\"}"},{"ref":"E15","kind":"event","title":"Research Intern RL & Post-Training Systems, Turbo (Fall 2026)","date":"2026-06-22T16:30:52+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5168929007","signal_url":"https://onlylabs.fyi/signals/7d171fac-d278-46c1-8672-ad9660d6e5a1","signal_json_url":"https://onlylabs.fyi/signals/7d171fac-d278-46c1-8672-ad9660d6e5a1/signal.json","text":"job_opened · Research Intern RL & Post-Training Systems, Turbo (Fall 2026) · signal_desk=hiring · occurred_at=2026-06-22T16:30:52+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5168929007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E16","kind":"event","title":"Workplace Coordinator","date":"2026-06-19T18:04:07+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5166459007","signal_url":"https://onlylabs.fyi/signals/5c44f7d5-d1ec-4d6e-abae-3918d796a022","signal_json_url":"https://onlylabs.fyi/signals/5c44f7d5-d1ec-4d6e-abae-3918d796a022/signal.json","text":"job_opened · Workplace Coordinator · signal_desk=hiring · occurred_at=2026-06-19T18:04:07+00:00 · 
url=https://job-boards.greenhouse.io/togetherai/jobs/5166459007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E17","kind":"event","title":"Backend Engineer ","date":"2026-06-19T13:36:56+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5068767007","signal_url":"https://onlylabs.fyi/signals/d1c81c6e-fcbf-45a9-b3cf-a86cc70d87a4","signal_json_url":"https://onlylabs.fyi/signals/d1c81c6e-fcbf-45a9-b3cf-a86cc70d87a4/signal.json","text":"job_opened · Backend Engineer  · signal_desk=hiring · occurred_at=2026-06-19T13:36:56+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5068767007 · raw={\"location\":\"Amsterdam\",\"ats\":\"greenhouse\"}"},{"ref":"E18","kind":"event","title":"togethercomputer/together-sandbox together-sandbox-workspace-v3.0.0","date":"2026-06-18T09:18:31+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v3.0.0","signal_url":"https://onlylabs.fyi/signals/da28f026-5338-43d3-a74e-d2f597912eae","signal_json_url":"https://onlylabs.fyi/signals/da28f026-5338-43d3-a74e-d2f597912eae/signal.json","text":"release · togethercomputer/together-sandbox together-sandbox-workspace-v3.0.0 · signal_desk=releases · occurred_at=2026-06-18T09:18:31+00:00 · url=https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v3.0.0 · raw={\"repo\":\"togethercomputer/together-sandbox\"}"},{"ref":"E19","kind":"event","title":"togethercomputer/together-typescript v0.41.2","date":"2026-06-17T16:39:52+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-typescript/releases/tag/v0.41.2","signal_url":"https://onlylabs.fyi/signals/873e82a8-d5fe-4517-a3d7-88d6a92f2dea","signal_json_url":"https://onlylabs.fyi/signals/873e82a8-d5fe-4517-a3d7-88d6a92f2dea/signal.json","text":"release · togethercomputer/together-typescript v0.41.2 · 
signal_desk=releases · occurred_at=2026-06-17T16:39:52+00:00 · url=https://github.com/togethercomputer/together-typescript/releases/tag/v0.41.2 · raw={\"repo\":\"togethercomputer/together-typescript\"}"},{"ref":"E20","kind":"event","title":"Kimi K2.7 Code vs Claude Fable 5: Landing pages that cost 94% less","date":"2026-06-17T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/kimi-k2-7-code-vs-claude-fable-5","signal_url":"https://onlylabs.fyi/signals/5a4fd1e2-4560-4992-8745-899df77ecad5","signal_json_url":"https://onlylabs.fyi/signals/5a4fd1e2-4560-4992-8745-899df77ecad5/signal.json","text":"post_published · Kimi K2.7 Code vs Claude Fable 5: Landing pages that cost 94% less · signal_desk=talking · occurred_at=2026-06-17T00:00:00+00:00 · url=https://www.together.ai/blog/kimi-k2-7-code-vs-claude-fable-5 · raw={\"excerpt\":\"We generated 12 landing pages with Kimi K2.7 Code and Claude Fable 5. Kimi cost 94% less and scored within a few points on every page. 
Here's what actually moved the needle.\"}"},{"ref":"E21","kind":"event","title":"Senior Technical Recruiter, AI/ML Research","date":"2026-06-16T21:30:01+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5135941007","signal_url":"https://onlylabs.fyi/signals/b1ff86c8-39c4-4964-8875-53dce4af32d7","signal_json_url":"https://onlylabs.fyi/signals/b1ff86c8-39c4-4964-8875-53dce4af32d7/signal.json","text":"job_opened · Senior Technical Recruiter, AI/ML Research · signal_desk=hiring · occurred_at=2026-06-16T21:30:01+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5135941007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E22","kind":"event","title":"AI Infrastructure Engineer","date":"2026-06-16T20:57:59+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5138540007","signal_url":"https://onlylabs.fyi/signals/66cc7094-4645-4c25-8b3c-b7d70e29ba17","signal_json_url":"https://onlylabs.fyi/signals/66cc7094-4645-4c25-8b3c-b7d70e29ba17/signal.json","text":"job_opened · AI Infrastructure Engineer · signal_desk=hiring · occurred_at=2026-06-16T20:57:59+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5138540007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E23","kind":"event","title":"Staff Engineer, Distributed Storage and HPC & AI Infrastructure","date":"2026-06-16T14:40:16+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5155722007","signal_url":"https://onlylabs.fyi/signals/57b9ecaf-5735-4acf-bcc8-20cb554e9f40","signal_json_url":"https://onlylabs.fyi/signals/57b9ecaf-5735-4acf-bcc8-20cb554e9f40/signal.json","text":"job_opened · Staff Engineer, Distributed Storage and HPC & AI Infrastructure · signal_desk=hiring · occurred_at=2026-06-16T14:40:16+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5155722007 · 
raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E24","kind":"event","title":"togethercomputer/together-py v2.16.1","date":"2026-06-16T12:26:34+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-py/releases/tag/v2.16.1","signal_url":"https://onlylabs.fyi/signals/e8aa3de7-2a0a-4088-8b89-4dc5369744cc","signal_json_url":"https://onlylabs.fyi/signals/e8aa3de7-2a0a-4088-8b89-4dc5369744cc/signal.json","text":"release · togethercomputer/together-py v2.16.1 · signal_desk=releases · occurred_at=2026-06-16T12:26:34+00:00 · url=https://github.com/togethercomputer/together-py/releases/tag/v2.16.1 · raw={\"repo\":\"togethercomputer/together-py\"}"},{"ref":"E25","kind":"event","title":"Finance Analytics Engineer ","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5108385007","signal_url":"https://onlylabs.fyi/signals/975e704e-00b7-4477-a9d8-50dab7510888","signal_json_url":"https://onlylabs.fyi/signals/975e704e-00b7-4477-a9d8-50dab7510888/signal.json","text":"job_opened · Finance Analytics Engineer  · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5108385007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E26","kind":"event","title":"Infrastructure Accounting Manager","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5134279007","signal_url":"https://onlylabs.fyi/signals/0cedd640-2f57-4fec-89ab-6495493e26b3","signal_json_url":"https://onlylabs.fyi/signals/0cedd640-2f57-4fec-89ab-6495493e26b3/signal.json","text":"job_opened · Infrastructure Accounting Manager · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5134279007 · raw={\"location\":\"San 
Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E27","kind":"event","title":"Machine Learning Engineer - Inference","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/4385540007","signal_url":"https://onlylabs.fyi/signals/5f533fc7-13b6-4ec4-ac79-8cd71064dd50","signal_json_url":"https://onlylabs.fyi/signals/5f533fc7-13b6-4ec4-ac79-8cd71064dd50/signal.json","text":"job_opened · Machine Learning Engineer - Inference · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/4385540007 · raw={\"location\":\"San Francisco \",\"ats\":\"greenhouse\"}"},{"ref":"E28","kind":"event","title":"Research Engineer, Frontier Speculative Decoding","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/4987660007","signal_url":"https://onlylabs.fyi/signals/a8aa3cfd-007b-4c6a-af86-1888ca87f29c","signal_json_url":"https://onlylabs.fyi/signals/a8aa3cfd-007b-4c6a-af86-1888ca87f29c/signal.json","text":"job_opened · Research Engineer, Frontier Speculative Decoding · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/4987660007 · raw={\"location\":\"San Francisco, New York City \",\"ats\":\"greenhouse\"}"},{"ref":"E29","kind":"event","title":"Systems Research Engineer Intern - GPU Programming (Fall 2026)","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5157559007","signal_url":"https://onlylabs.fyi/signals/4b3da85f-9dc2-4d72-b9f9-995f60076923","signal_json_url":"https://onlylabs.fyi/signals/4b3da85f-9dc2-4d72-b9f9-995f60076923/signal.json","text":"job_opened · Systems Research Engineer Intern - GPU Programming (Fall 2026) · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · 
url=https://job-boards.greenhouse.io/togetherai/jobs/5157559007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E30","kind":"event","title":"Senior Software Engineer - Together Cloud Infrastructure","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/4749787007","signal_url":"https://onlylabs.fyi/signals/26c892c2-17cc-4135-9bd7-d7148f0d0696","signal_json_url":"https://onlylabs.fyi/signals/26c892c2-17cc-4135-9bd7-d7148f0d0696/signal.json","text":"job_opened · Senior Software Engineer - Together Cloud Infrastructure · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/4749787007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E31","kind":"event","title":"Systems Research Engineer, GPU Programming","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/4188119007","signal_url":"https://onlylabs.fyi/signals/9e23d7ba-52b0-47dc-b060-6eefb99cc669","signal_json_url":"https://onlylabs.fyi/signals/9e23d7ba-52b0-47dc-b060-6eefb99cc669/signal.json","text":"job_opened · Systems Research Engineer, GPU Programming · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/4188119007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E32","kind":"event","title":"Senior Software Engineer Together Cloud Infrastructure ","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5028862007","signal_url":"https://onlylabs.fyi/signals/70922365-2270-424e-b8c5-3ef9d42e86ed","signal_json_url":"https://onlylabs.fyi/signals/70922365-2270-424e-b8c5-3ef9d42e86ed/signal.json","text":"job_opened · Senior Software Engineer Together Cloud Infrastructure  · 
signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5028862007 · raw={\"location\":\"Amsterdam\",\"ats\":\"greenhouse\"}"},{"ref":"E33","kind":"event","title":"Analytics Engineer — Data Warehouse","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5101651007","signal_url":"https://onlylabs.fyi/signals/ab917d66-fdb1-45ed-ace0-7c7a7333a24f","signal_json_url":"https://onlylabs.fyi/signals/ab917d66-fdb1-45ed-ace0-7c7a7333a24f/signal.json","text":"job_opened · Analytics Engineer — Data Warehouse · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5101651007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E34","kind":"event","title":"Forward Deployed Engineer (Inference & Post-Training)","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5131941007","signal_url":"https://onlylabs.fyi/signals/fe5f52fd-3970-4c7b-91d4-ed57cea6c88e","signal_json_url":"https://onlylabs.fyi/signals/fe5f52fd-3970-4c7b-91d4-ed57cea6c88e/signal.json","text":"job_opened · Forward Deployed Engineer (Inference & Post-Training) · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5131941007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E35","kind":"event","title":"Infrastructure Design Engineer","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5135876007","signal_url":"https://onlylabs.fyi/signals/79bc42b2-57aa-442a-87ff-df65bf1d0ee0","signal_json_url":"https://onlylabs.fyi/signals/79bc42b2-57aa-442a-87ff-df65bf1d0ee0/signal.json","text":"job_opened · Infrastructure Design Engineer · signal_desk=hiring · 
occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5135876007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E36","kind":"event","title":"AI Researcher, Core ML (Turbo)","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/4187681007","signal_url":"https://onlylabs.fyi/signals/d0de6088-273e-4e81-ae0a-2c0cc6368636","signal_json_url":"https://onlylabs.fyi/signals/d0de6088-273e-4e81-ae0a-2c0cc6368636/signal.json","text":"job_opened · AI Researcher, Core ML (Turbo) · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/4187681007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E37","kind":"event","title":"Backend Software Engineer — Data Platform & AI Data Products","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5064263007","signal_url":"https://onlylabs.fyi/signals/f10a5e22-9ef2-4e33-a217-b4dbd6612def","signal_json_url":"https://onlylabs.fyi/signals/f10a5e22-9ef2-4e33-a217-b4dbd6612def/signal.json","text":"job_opened · Backend Software Engineer — Data Platform & AI Data Products · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5064263007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E38","kind":"event","title":"Customer Support Engineer (Inference)","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5147747007","signal_url":"https://onlylabs.fyi/signals/1a44a1f8-276f-4c6c-b599-3c97553fbaf0","signal_json_url":"https://onlylabs.fyi/signals/1a44a1f8-276f-4c6c-b599-3c97553fbaf0/signal.json","text":"job_opened · Customer Support Engineer (Inference) · signal_desk=hiring · 
occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5147747007 · raw={\"location\":\"San Francisco, CA\",\"ats\":\"greenhouse\"}"},{"ref":"E39","kind":"event","title":"Lead Product Designer","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5062829007","signal_url":"https://onlylabs.fyi/signals/449a50aa-9650-4a7b-a647-8c1b2da48c5d","signal_json_url":"https://onlylabs.fyi/signals/449a50aa-9650-4a7b-a647-8c1b2da48c5d/signal.json","text":"job_opened · Lead Product Designer · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5062829007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E40","kind":"event","title":"Customer Support Engineer (GPU Cluster)","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5093510007","signal_url":"https://onlylabs.fyi/signals/27e661d2-0aa9-4e34-87a2-27ad94be6490","signal_json_url":"https://onlylabs.fyi/signals/27e661d2-0aa9-4e34-87a2-27ad94be6490/signal.json","text":"job_opened · Customer Support Engineer (GPU Cluster) · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · url=https://job-boards.greenhouse.io/togetherai/jobs/5093510007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E41","kind":"event","title":"Staff Engineer, Product UI Platform","date":"2026-06-15T20:19:03+00:00","date_source":"greenhouse.updated_at","source_url":"https://job-boards.greenhouse.io/togetherai/jobs/5074088007","signal_url":"https://onlylabs.fyi/signals/55781bf9-3881-40be-9360-d73a072d169a","signal_json_url":"https://onlylabs.fyi/signals/55781bf9-3881-40be-9360-d73a072d169a/signal.json","text":"job_opened · Staff Engineer, Product UI Platform · signal_desk=hiring · occurred_at=2026-06-15T20:19:03+00:00 · 
url=https://job-boards.greenhouse.io/togetherai/jobs/5074088007 · raw={\"location\":\"San Francisco\",\"ats\":\"greenhouse\"}"},{"ref":"E42","kind":"event","title":"Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and Multimodality Without Regrets ","date":"2026-06-02T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/serving-minimax-m3-for-efficient-inference-unlocking-1m-token-context-and-multimodality-without-regrets","signal_url":"https://onlylabs.fyi/signals/33644a67-d468-44ed-8255-6990f9054eec","signal_json_url":"https://onlylabs.fyi/signals/33644a67-d468-44ed-8255-6990f9054eec/signal.json","text":"post_published · Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and Multimodality Without Regrets  · signal_desk=talking · occurred_at=2026-06-02T00:00:00+00:00 · url=https://www.together.ai/blog/serving-minimax-m3-for-efficient-inference-unlocking-1m-token-context-and-multimodality-without-regrets · hn=1 points/0 comments · raw={\"excerpt\":\"How Together served MiniMax-M3 efficiently with KV-block-major sparse attention, paged MSA decode, optimized index scoring, and a Rust-based multimodal gateway.\"}"},{"ref":"E43","kind":"event","title":"togethercomputer/InferenceX","date":"2026-06-11T20:56:28+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/InferenceX","signal_url":"https://onlylabs.fyi/signals/776880b0-159f-4240-842c-09efa3c1f0c6","signal_json_url":"https://onlylabs.fyi/signals/776880b0-159f-4240-842c-09efa3c1f0c6/signal.json","text":"repo_forked · togethercomputer/InferenceX · signal_desk=forks · occurred_at=2026-06-11T20:56:28+00:00 · url=https://github.com/togethercomputer/InferenceX · raw={\"repo\":\"togethercomputer/InferenceX\",\"parent\":\"SemiAnalysisAI/InferenceX\"}"},{"ref":"E44","kind":"event","title":"Building trust in enterprise AI: Together AI earns ISO 27001:2022 
certification","date":"2026-06-10T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/iso-27001-2022-certification","signal_url":"https://onlylabs.fyi/signals/9294f377-1f3d-4b21-8078-53ecff3e7406","signal_json_url":"https://onlylabs.fyi/signals/9294f377-1f3d-4b21-8078-53ecff3e7406/signal.json","text":"post_published · Building trust in enterprise AI: Together AI earns ISO 27001:2022 certification · signal_desk=talking · occurred_at=2026-06-10T00:00:00+00:00 · url=https://www.together.ai/blog/iso-27001-2022-certification · raw={\"excerpt\":\"Together AI has earned ISO 27001:2022 certification, validating our commitment to enterprise-grade security for production AI workloads.\"}"},{"ref":"E45","kind":"event","title":"togethercomputer/together-sandbox together-sandbox-workspace-v2.0.0","date":"2026-06-02T08:57:34+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v2.0.0","signal_url":"https://onlylabs.fyi/signals/8febc70f-8c38-45b4-b078-4163db722996","signal_json_url":"https://onlylabs.fyi/signals/8febc70f-8c38-45b4-b078-4163db722996/signal.json","text":"release · togethercomputer/together-sandbox together-sandbox-workspace-v2.0.0 · signal_desk=releases · occurred_at=2026-06-02T08:57:34+00:00 · url=https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v2.0.0 · raw={\"repo\":\"togethercomputer/together-sandbox\"}"},{"ref":"E46","kind":"event","title":"togethercomputer/tinker-cookbook","date":"2026-06-01T08:29:09+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/tinker-cookbook","signal_url":"https://onlylabs.fyi/signals/6367550f-bd25-4483-8fda-eba8b039b892","signal_json_url":"https://onlylabs.fyi/signals/6367550f-bd25-4483-8fda-eba8b039b892/signal.json","text":"repo_forked · togethercomputer/tinker-cookbook · signal_desk=forks · occurred_at=2026-06-01T08:29:09+00:00 · 
url=https://github.com/togethercomputer/tinker-cookbook · raw={\"repo\":\"togethercomputer/tinker-cookbook\",\"parent\":\"thinking-machines-lab/tinker-cookbook\"}"},{"ref":"E47","kind":"event","title":"togethercomputer/xorl-wheels tilelang_0.1.10_cu131","date":"2026-05-31T20:55:20+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/xorl-wheels/releases/tag/tilelang_0.1.10_cu131","signal_url":"https://onlylabs.fyi/signals/688f8bcd-26e2-4d14-89f4-40bd8877ea2a","signal_json_url":"https://onlylabs.fyi/signals/688f8bcd-26e2-4d14-89f4-40bd8877ea2a/signal.json","text":"release · togethercomputer/xorl-wheels tilelang_0.1.10_cu131 · signal_desk=releases · occurred_at=2026-05-31T20:55:20+00:00 · url=https://github.com/togethercomputer/xorl-wheels/releases/tag/tilelang_0.1.10_cu131 · raw={\"repo\":\"togethercomputer/xorl-wheels\"}"},{"ref":"E48","kind":"event","title":"togethercomputer/detect_agent detect_agent-v0.3.0","date":"2026-05-29T19:08:51+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/detect_agent/releases/tag/detect_agent-v0.3.0","signal_url":"https://onlylabs.fyi/signals/cd8b0818-971f-4eba-8365-f61bb100577b","signal_json_url":"https://onlylabs.fyi/signals/cd8b0818-971f-4eba-8365-f61bb100577b/signal.json","text":"release · togethercomputer/detect_agent detect_agent-v0.3.0 · signal_desk=releases · occurred_at=2026-05-29T19:08:51+00:00 · url=https://github.com/togethercomputer/detect_agent/releases/tag/detect_agent-v0.3.0 · raw={\"repo\":\"togethercomputer/detect_agent\"}"},{"ref":"E49","kind":"event","title":"togethercomputer/together-sandbox 
together-sandbox-workspace-v1.12.0","date":"2026-05-29T08:38:53+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v1.12.0","signal_url":"https://onlylabs.fyi/signals/d1eac417-81a5-4ad3-a773-2d5afb577f5a","signal_json_url":"https://onlylabs.fyi/signals/d1eac417-81a5-4ad3-a773-2d5afb577f5a/signal.json","text":"release · togethercomputer/together-sandbox together-sandbox-workspace-v1.12.0 · signal_desk=releases · occurred_at=2026-05-29T08:38:53+00:00 · url=https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v1.12.0 · raw={\"repo\":\"togethercomputer/together-sandbox\"}"},{"ref":"E50","kind":"event","title":"How Together AI built the world’s fastest speech-to-text stack","date":"2026-05-29T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/how-together-ai-built-the-worlds-fastest-speech-to-text-stack","signal_url":"https://onlylabs.fyi/signals/56ba412f-f785-4495-a0c4-bec800f64fd3","signal_json_url":"https://onlylabs.fyi/signals/56ba412f-f785-4495-a0c4-bec800f64fd3/signal.json","text":"post_published · How Together AI built the world’s fastest speech-to-text stack · signal_desk=talking · occurred_at=2026-05-29T00:00:00+00:00 · url=https://www.together.ai/blog/how-together-ai-built-the-worlds-fastest-speech-to-text-stack · raw={\"excerpt\":\"Together AI built the fastest speech-to-text stack on Artificial Analysis by treating ASR as a full-path systems problem, not just a GPU inference problem.\"}"},{"ref":"E51","kind":"event","title":"togethercomputer/together-sandbox 
together-sandbox-workspace-v1.11.0","date":"2026-05-28T09:01:08+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v1.11.0","signal_url":"https://onlylabs.fyi/signals/b50366e3-3cfd-44f0-8336-a160cf3ec0e2","signal_json_url":"https://onlylabs.fyi/signals/b50366e3-3cfd-44f0-8336-a160cf3ec0e2/signal.json","text":"release · togethercomputer/together-sandbox together-sandbox-workspace-v1.11.0 · signal_desk=releases · occurred_at=2026-05-28T09:01:08+00:00 · url=https://github.com/togethercomputer/together-sandbox/releases/tag/together-sandbox-workspace-v1.11.0 · raw={\"repo\":\"togethercomputer/together-sandbox\"}"},{"ref":"E52","kind":"event","title":"Benchmarking inference at scale: coding agents","date":"2026-05-19T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/coding-agent-benchmarks","signal_url":"https://onlylabs.fyi/signals/3c08a1c0-235e-42b0-b347-d52e39d12ee1","signal_json_url":"https://onlylabs.fyi/signals/3c08a1c0-235e-42b0-b347-d52e39d12ee1/signal.json","text":"post_published · Benchmarking inference at scale: coding agents · signal_desk=talking · occurred_at=2026-05-19T00:00:00+00:00 · url=https://www.together.ai/blog/coding-agent-benchmarks · raw={\"excerpt\":\"Real-world inference benchmarks for coding agents: 31% more TPS than TensorRT-LLM, 2× better TTFT at saturation, and 76% lower cost than Claude Opus 4.6.\"}"},{"ref":"E53","kind":"event","title":"Together AI and Pearl Research Labs Team Up to Reduce the Cost of AI Inference","date":"2026-05-15T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/together-ai-partners-with-pearl-research-labs","signal_url":"https://onlylabs.fyi/signals/49734867-446a-4524-963f-4812d706b5eb","signal_json_url":"https://onlylabs.fyi/signals/49734867-446a-4524-963f-4812d706b5eb/signal.json","text":"post_published · Together AI and Pearl Research Labs Team Up to 
Reduce the Cost of AI Inference · signal_desk=talking · occurred_at=2026-05-15T00:00:00+00:00 · url=https://www.together.ai/blog/together-ai-partners-with-pearl-research-labs · raw={\"excerpt\":\"Together AI partners with Pearl Research Labs to launch a discounted Pearl-powered inference endpoint for Gemma-4-31B-it-pearl, using Proof of Useful Work to turn AI workloads into crypto emissions.\"}"},{"ref":"E54","kind":"event","title":"Violin: An open-source video translation skill that breaks language barriers","date":"2026-05-14T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/violin-open-source-translation-skill","signal_url":"https://onlylabs.fyi/signals/558e6d06-9f96-454a-a3bf-e34988a0e832","signal_json_url":"https://onlylabs.fyi/signals/558e6d06-9f96-454a-a3bf-e34988a0e832/signal.json","text":"post_published · Violin: An open-source video translation skill that breaks language barriers · signal_desk=talking · occurred_at=2026-05-14T00:00:00+00:00 · url=https://www.together.ai/blog/violin-open-source-translation-skill · raw={\"excerpt\":\"Violin is an open-source AI video translation tool that combines speech recognition, LLM translation, and text-to-speech to make video content accessible across languages.\"}"},{"ref":"E55","kind":"event","title":"Introducing voice finder — a new tool to quickly find the right voice for your app from over 600+ voices ","date":"2026-05-12T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/introducing-voice-finder-a-new-tool-to-quickly-find-the-right-voice-for-your-app-from-over-600-voices","signal_url":"https://onlylabs.fyi/signals/eb4dd7b9-04a8-47e9-afa1-ca27b235f938","signal_json_url":"https://onlylabs.fyi/signals/eb4dd7b9-04a8-47e9-afa1-ca27b235f938/signal.json","text":"post_published · Introducing voice finder — a new tool to quickly find the right voice for your app from over 600+ voices  · signal_desk=talking · 
occurred_at=2026-05-12T00:00:00+00:00 · url=https://www.together.ai/blog/introducing-voice-finder-a-new-tool-to-quickly-find-the-right-voice-for-your-app-from-over-600-voices · raw={\"excerpt\":\"Voice finder helps developers search, match, filter, and audition 600+ voices across Together AI TTS models using natural-language prompts or uploaded audio samples.\"}"},{"ref":"E56","kind":"event","title":"Serving DeepSeek-V4: why million-token context is an inference systems problem","date":"2026-05-11T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/serving-deepseek-v4-why-million-token-context-is-an-inference-systems-problem","signal_url":"https://onlylabs.fyi/signals/acc3bbe8-7204-4369-9fab-77561527ceef","signal_json_url":"https://onlylabs.fyi/signals/acc3bbe8-7204-4369-9fab-77561527ceef/signal.json","text":"post_published · Serving DeepSeek-V4: why million-token context is an inference systems problem · signal_desk=talking · occurred_at=2026-05-11T00:00:00+00:00 · url=https://www.together.ai/blog/serving-deepseek-v4-why-million-token-context-is-an-inference-systems-problem · raw={\"excerpt\":\"DeepSeek-V4 makes million-token context a serving-systems problem. 
Together AI explores the inference work behind V4 on NVIDIA HGX B200, including compressed KV layouts, prefix caching, kernel maturity, and endpoint profiles for long-context workloads.\"}"},{"ref":"E57","kind":"event","title":"Deploy and inference any model from HuggingFace","date":"2026-05-08T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/deploy-and-inference-any-model-from-huggingface","signal_url":"https://onlylabs.fyi/signals/bf48f114-a6a8-4fc5-a087-2bc7d861230d","signal_json_url":"https://onlylabs.fyi/signals/bf48f114-a6a8-4fc5-a087-2bc7d861230d/signal.json","text":"post_published · Deploy and inference any model from HuggingFace · signal_desk=talking · occurred_at=2026-05-08T00:00:00+00:00 · url=https://www.together.ai/blog/deploy-and-inference-any-model-from-huggingface · raw={\"excerpt\":\"Learn how to deploy any Hugging Face model in one session using Goose and Together's Dedicated Container Inference. Skip the setup complexity — one prompt gets your model running in a production-grade GPU environment on release day.\"}"},{"ref":"E58","kind":"event","title":"Parcae: Doing more with fewer parameters using stable looped models","date":"2026-04-15T00:00:00+00:00","date_source":"rss.item_date","source_url":"https://www.together.ai/blog/parcae","signal_url":"https://onlylabs.fyi/signals/4b6752d0-efd3-40d9-a892-c03fc06f5133","signal_json_url":"https://onlylabs.fyi/signals/4b6752d0-efd3-40d9-a892-c03fc06f5133/signal.json","text":"post_published · Parcae: Doing more with fewer parameters using stable looped models · signal_desk=talking · occurred_at=2026-04-15T00:00:00+00:00 · url=https://www.together.ai/blog/parcae · hn=2 points/0 comments · raw={\"excerpt\":\"Parcae is a stable looped language model that matches the quality of a Transformer twice its size — a 770M model reaching 1.3B-level performance. 
We introduce the first scaling laws for looping and show that increasing recurrence, not just data, is a compute-efficient path to bet\"}"},{"ref":"E59","kind":"event","title":"togethercomputer/k8s-netperf","date":"2026-04-27T23:08:39+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/k8s-netperf","signal_url":"https://onlylabs.fyi/signals/e6175547-cf8b-4774-9c10-ad0876c3f14e","signal_json_url":"https://onlylabs.fyi/signals/e6175547-cf8b-4774-9c10-ad0876c3f14e/signal.json","text":"repo_forked · togethercomputer/k8s-netperf · signal_desk=forks · occurred_at=2026-04-27T23:08:39+00:00 · url=https://github.com/togethercomputer/k8s-netperf · raw={\"repo\":\"togethercomputer/k8s-netperf\",\"parent\":\"cloud-bulldozer/k8s-netperf\"}"},{"ref":"E60","kind":"event","title":"togethercomputer/DeepGEMM","date":"2026-04-24T20:40:07+00:00","date_source":"source","source_url":"https://github.com/togethercomputer/DeepGEMM","signal_url":"https://onlylabs.fyi/signals/e1ad880b-e8c2-4eed-96b9-9286706c6932","signal_json_url":"https://onlylabs.fyi/signals/e1ad880b-e8c2-4eed-96b9-9286706c6932/signal.json","text":"repo_forked · togethercomputer/DeepGEMM · signal_desk=forks · occurred_at=2026-04-24T20:40:07+00:00 · url=https://github.com/togethercomputer/DeepGEMM · raw={\"repo\":\"togethercomputer/DeepGEMM\",\"parent\":\"deepseek-ai/DeepGEMM\"}"}]}