Introducing voice finder: a new tool to quickly find the right voice for your app from 600+ voices
Published 5/12/2026
A new tool from Together AI for searching, filtering, and auditioning voices across leading TTS models.
Authors
Zain Hasan, Sonny Khan
Summary
Voice finder: Search 600+ voices across MiniMax, Cartesia, Deepgram, Rime, and other models available through Together AI.
Search by prompt or audio: Describe the voice you need, or upload a short voice sample to find similar voices with playable recommendations.
Model-aware metadata: Each voice is tagged across 15+ attributes, including pitch, accent, language, age, emotion, and speaking style.
Choosing the right voice for a voice agent is still too manual. Provider catalogs can include dozens or hundreds of voices, and the documentation rarely tells you which one fits a fintech support agent, a meditation guide, or a game show host. Voice finder gives developers a faster way to search the Together AI voice catalog. Type in what you are building or upload a short audio sample of the voice you have in mind, then compare ranked recommendations, listen inline, and filter by the attributes that matter for your use case.
How it works
Voice finder indexes 600+ voices across 10 TTS models on Together AI. Each voice is playable directly in the tool. Behind the ranking layer, an omni-model has listened to every voice and generated structured metadata across 15+ dimensions, including pitch, gender, accent, language, age, emotion, and speaking style. That metadata powers both natural-language search and manual filtering.
A few example searches:
“a calm female voice for a meditation app”
“a confident voice for a fintech support agent”
“an energetic host for a game show”
“a warm bilingual voice for customer service”
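As a rough illustration of how structured metadata can back both manual filtering and prompt search, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Voice` schema, the voice IDs, the model names, and the word-overlap scoring are hypothetical, and the post does not describe voice finder's actual schema or ranking method. A production system would presumably use something richer than word overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Voice:
    """One catalog entry. Field names are hypothetical, not the tool's schema."""
    voice_id: str
    model: str
    attributes: dict[str, str] = field(default_factory=dict)
    tags: set[str] = field(default_factory=set)


def filter_voices(catalog, **required):
    """Manual filtering: keep voices whose attributes match every requirement."""
    return [
        v for v in catalog
        if all(v.attributes.get(key) == value for key, value in required.items())
    ]


def rank_voices(catalog, query):
    """Toy stand-in for prompt search: score by word overlap with tags and attribute values."""
    words = set(query.lower().split())
    scored = []
    for v in catalog:
        terms = v.tags | set(v.attributes.values())
        score = len(words & terms)
        if score:
            scored.append((score, v))
    # Sort on the score only, so Voice objects are never compared directly.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [v for _, v in scored]


# Made-up catalog entries; real voice IDs and models will differ.
CATALOG = [
    Voice("voice-a", "model-x",
          {"gender": "female", "pitch": "low", "emotion": "calm", "language": "en"},
          {"meditation", "soothing"}),
    Voice("voice-b", "model-y",
          {"gender": "male", "pitch": "medium", "emotion": "confident", "language": "en"},
          {"support", "fintech"}),
    Voice("voice-c", "model-x",
          {"gender": "female", "pitch": "high", "emotion": "energetic", "language": "en"},
          {"host", "game"}),
]

calm_female = filter_voices(CATALOG, gender="female", emotion="calm")
ranked = rank_voices(CATALOG, "a calm female voice for a meditation app")

print([v.voice_id for v in calm_female])  # ['voice-a']
print([v.voice_id for v in ranked])       # ['voice-a', 'voice-c']
```

The same metadata serves both paths here: exact-match filters narrow the catalog, while the query scorer orders whatever matches loosely.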
The goal is simple: get from a use case to a short list of voices quickly enough to keep building.
Why this matters for voice agents
Voice agents depend on more than model quality. The voice has to fit the product, the customer, and the moment. A healthcare intake agent, a restaurant ordering agent, and an entertainment companion should not sound interchangeable.
Together AI gives teams a single platform for building real-time voice agents across STT, LLM, and TTS. The full pipeline runs co-located on one cloud, holding end-to-end latency under 500 ms, fast enough for real-time turn-taking. Voice finder makes the model-selection step easier by giving developers a faster way to explore the voices available across that stack.
Get started
→ Try Voice finder at findtherightvoice.com
→ Explore the Together AI Voice Platform
→ Read the voice agent documentation
→ Contact Sales for dedicated endpoints and production deployment