WritingTogether AITogether AIpublished Mar 18, 2026seen 5d

Together AI expands fine-tuning service with tool calling, reasoning, and vision support

Open original ↗

Captured source

source ↗

Together AI expands fine-tuning service with tool calling, reasoning, and vision support

⚡️ FlashAttention-4: up to 1.3× faster than cuDNN on NVIDIA Blackwell →

Introducing Together AI's new look →

🔎 ATLAS: runtime-learning accelerators delivering up to 4x faster LLM inference →

⚡ Together GPU Clusters: self-service NVIDIA GPUs, now generally available →

📦 Batch Inference API: Process billions of tokens at 50% lower cost for most models →

🪛 Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts →

All blog posts

Fine-Tuning

Published 3/18/2026

Together AI expands fine-tuning service with tool calling, reasoning, and vision support

Authors

Conner Manuel, Nikita Smetanin, Artem Chumachenko, Ruslan Khaidurov, Maksim Abraham, Sergei Vorobyov, Gleb Khaykin, Soroush Bassam, Gleb Vazhenin, Denis Fedorenko, Zain Hasan, Sonny Khan, Will Van Eaton, Jennifer Wu, Max Ryabinin

Table of contents

40+ Models Chosen for Production...40+ Models Chosen for Production...40+ Models Chosen for Production...

Links in this article

VLM fine-tuning R easoning fine-tuning F unction calling fine-tuning Documentation Contact Sales

What’s New

Tool call fine-tuning: Ensure agents execute structured actions reliably with end-to-end fine-tuning and inference on OpenAI-compatible schema. Reasoning fine-tuning: Specialized support for training models on “thinking” tokens in reasoning traces, allowing models to learn complex logic. Vision-language model fine-tuning: Native support for vision training to align vision-language models with complex, domain-specific visual data. ‍ Large model support: train the latest models with up to 1T parameters on our highly optimized and easy-to-use service.

As AI teams move from single-turn prompting to advanced multi-turn workflows, reliability breaks in predictable places: tool calls that don’t match schemas, reasoning that degrades over long interactions, and models that miss domain-specific visual signals. Fixing those issues usually requires post-training, but the workflow is often fragmented, slow to iterate, and hard to plan. Today, Together AI, the AI Native Cloud, is expanding Together Fine-Tuning with native support for tool call, reasoning, and vision-language model (VLM) fine-tuning. To support frontier-scale post-training, we have also upgraded the training stack to handle 100B+ parameter models more efficiently, delivering up to 6× higher throughput. In addition, we now support fine-tuning on datasets of up to 100GB in size. Finally, we now provide job cost estimations before training and ETA during training, so teams can better plan their experiments. "Together AI does for fine-tuning and inference what Vercel does for LLM-based apps — it removes the infrastructure layer so we can focus on our product. We fine‑tune and deploy customer‑specific models through simple API calls. That lets our existing team move from weekly to daily iteration, cut costs by 2–3×, and improve accuracy from 77% to 87%." — Lamara De Brouwer, Co-Founder & CTO, XY.AI Labs Tool Call Fine-tuning Tool calling is essential to many modern agentic use cases. Yet, out-of-the-box models often struggle with tool calling: hallucinating parameters, selecting incorrect functions, or failing to follow multi-step sequences. In tool calling workflows, even small inconsistencies can cascade into downstream failures. Our fine-tuning service now delivers an end-to-end solution for reliable, production-grade tool calling, spanning fine-tuning through inference. Tool calls can be included in training data using the OpenAI-compatible schema. Functions are defined in a top-level tools array, and our service validates that every tool_calls entry matches a declared tool, ensuring structurally correct data before training begins. At inference time, we’ve significantly improved tool call reliability to ensure the benefits of tool call fine-tuning translate into production performance. Enhanced parsing and validation improve correctness across a wide range of real-world use cases, supported by inference tool calling datasets curated from both community contributions and internal research. Tool-call fine-tuning is available for models from Qwen, Moonshot AI, and Z.AI. See the tool calling documentation to get started. To see an example of tool call functionality in code, take a look at our cookbook . Reasoning Fine-tuning Reasoning models generate intermediate thinking traces before producing a final answer, enabling step-by-step reasoning. However, reasoning formats are not standardized across models, introducing complexity into the reasoning fine-tuning process. Together Fine-tuning now supports fine-tuning directly on thinking traces using a reasoning or reasoning_content field in assistant messages. This lets you train models on domain-specific reasoning patterns while keeping traces structured and reproducible. As with tool calling, we have improved reasoning inference to ensure that fine-tuned capabilities translate into reliable downstream performance. Reasoning fine-tuning is available for models from Qwen and Z.AI. See our documentation page for supported models and details. For an end-to-end code demo of reasoning fine-tuning, check out our cookbook Vision-Language Model Fine-tuning Many AI workflows require models that can interpret image inputs. For domain-specific tasks like medical imaging and eCommerce, vision-language models (VLMs) may need to learn new visual patterns to be effective. Together Fine-tuning service now supports fine-tuning of vision-language models. Vision training data is provided inline using message content arrays with base64 encoded images. Fine-tuning jobs support hybrid datasets, allowing both image-text examples and text-only examples within the same run. By default, we freeze the vision encoder and update only the language layers. Setting train_vision=true enables joint training, allowing updates to both the vision encoder and language layers. VLM fine-tuning is available for models from Qwen, Google, and Meta. See the vision-language documentation for the supported list and usage details. You can also check out our cookbook for vision language fine-tuning here . Large Model Fine-tuning As the sizes of open models grow and context windows expand, the underlying training infrastructure has to keep pace. Trillion-parameter models cannot fit on a single node, thus needing careful communication and memory management across multiple machines. Even a…

Excerpt shown — open the source for the full document.

Notability

notability 7.0/10

Notable service expansion with key AI capabilities.