Expanding Together AI Model Library into multimedia generation with 40+ new image and video models
Captured source
source ↗Expanding Together AI Model Library into multimedia generation with 40+ new image and video models
⚡️ FlashAttention-4: up to 1.3× faster than cuDNN on NVIDIA Blackwell →
Introducing Together AI's new look →
🔎 ATLAS: runtime-learning accelerators delivering up to 4x faster LLM inference →
⚡ Together GPU Clusters: self-service NVIDIA GPUs, now generally available →
📦 Batch Inference API: Process billions of tokens at 50% lower cost for most models →
🪛 Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts →
All blog posts
Model Library
Published 10/21/2025
Expanding Together AI Model Library into multimedia generation with 40+ new image and video models
Build complete multimodal applications with video, image, and text generation through unified APIs.
Authors
Justin Driemeyer, Necoline Hubner, Derek Petersen, Blaine Kasten, Rishabh Bhargava, Sonny Khan
Table of contents
40+ Models Chosen for Production...40+ Models Chosen for Production...40+ Models Chosen for Production...
Links in this article
Interactive Playground API Documentation Model Library
What's New
New video generation API with models like OpenAI Sora 2 , Google Veo 3.0 , and Minimax Hailuo for high-quality video creation 40+ new image and video models , including Google's Imagen and Nano Banana , ByteDance SeeDream , and specialized editing tools Complete workflows - Combine text, image, and video generation in single applications without switching providers Same APIs you know - OpenAI-compatible endpoints, unified auth, transparent per-model pricing Available now : Serverless endpoints with enterprise options for scale
Generative media is at the center of a new set of AI-native applications, from AI-powered video editors and personalized gaming experiences to automated marketing content. But building these apps has been complex, with developers having to juggle providers for text, images, and video—each with new SDKs, auth, rate limits, and billing. That fragmentation slows teams, complicates SLAs, and makes scaling a headache. Today Together AI, the AI Native Cloud, is expanding the Together Model Libary to become your complete generative media infrastructure. Through our strategic partnership with Runware , we're integrating 20+ video models across six providers (including Google Veo 3.0, OpenAI Sora 2, and ByteDance Seedream) plus 15+ image models alongside leading LLMs and voice—spanning the quality-speed-cost spectrum that real applications demand, all accessible through the same fast, reliable APIs you use for text generation. 40+ Models Chosen for Production Workflows New Video Generation Models Video generation is new to Together AI. We're starting with models that create 4-30 second videos at various resolutions and styles. Each model optimizes for different needs - realism, motion consistency, or extended length. From quick 10-second clips with Minimax Hailuo to extended 30-second sequences with Kling v2.1, and specialized motion generation with SeeDance. This variety ensures developers can choose the right tool for their specific video generation requirements, from rapid prototyping to production-quality content creation.
Sora 2 Pro
8s
Your browser does not support the video tag.
Premium cinematic video generation with native audio and lifelike physics.
$2.40/video (720p/8s)
Try now
Google Veo 3
8s
Your browser does not support the video tag.
High-quality video creation with advanced camera movements and scene control.
$1.60/video (720p/8s)
Try now
PixVerse V5
5s
Your browser does not support the video tag.
Fast, affordable video generation with smooth motion and multiple artistic styles.
$0.30/video (1080p/5s)
Try now
ByteDance Seedance 1.0 Pro
5s
Your browser does not support the video tag.
Top-ranked video generation with multi-shot storytelling and cinematic quality.
$0.57/video (1080p/5s)
Try now
New Image Generation & Editing Models Together AI's image generation capabilities span the full spectrum of creative and production needs. From photorealistic generation with Google's Imagen to artistic control with models like Nano Banana, developers get access to specialized tools optimized for different use cases without researching individual providers or managing separate integrations. Gemini Flash Image 2.5 (Nano Banana)
Versatile image creation and editing with natural language control. $0.039/image
Try now
Google Imagen 4.0 Ultra
Premium image generation with exceptional detail and text rendering. $0.06/image
Try now
Qwen Image
High-quality image generation with perfect text integration and poster design. $0.0058/image
Try now
34+ More Models
Complete range of specialized models for every creative and production use case. From $0.0006/image
Browse all
Build Complete Workflows in One Platform Combine text, image, and video generation in a single codebase without managing multiple providers. Your existing Together integration gains image editing, creative generation, and video production capabilities. Here are three types of applications this makes practical to build: 🎮 Media Generation in Gaming
Technical capability: Gaming studios generating environmental assets, character variations, and cutscenes programmatically based on gameplay data. Platform advantage: Single API call chain from game state to visual assets, enabling real-time content generation without managing multiple inference providers.
🛍️ Dynamic Advertising Creative
Technical capability: E-commerce platforms generating personalized product images, lifestyle shots, and video ads based on user preferences, seasonal trends, and inventory data. Platform advantage: Real-time creative generation from user data to personalized visuals, enabling dynamic ad optimization without coordinating separate image and video providers.
🧠 Interactive Learning Platforms
Technical capability: Educational applications creating custom visual explanations, interactive diagrams, and personalized video content based on student questions and progress. Platform advantage: Real-time multimodal responses using the same inference infrastructure, enabling sophisticated personalization without latency penalties from provider switching.
Production Deployment Options Together AI's generative media capabilities are production-ready with enterprise-grade infrastructure and developer-focused tools. Performance & Scale
✔ 40+ image and video models ✔ Up to 30-second video generation ✔…
Excerpt shown — open the source for the full document.
Notability
notability 8.0/10Notable expansion with many models