Ultravox
Ultravox is Fixie.ai's real-time voice AI platform for developers who need agents that speak, listen, and respond in live conversations. It targets teams building phone bots, in-app voice assistants, and other speech interfaces where lag or robotic delivery kills the experience. The product sits behind several high-profile voice agents, with APIs and SDKs meant for production deployment rather than demos alone.
Most voice stacks transcribe speech to text before an LLM can reason, which adds latency and strips tone, cadence, and pitch from the signal. Ultravox takes a speech-native route: its own model reads audio directly, and the company runs dedicated inference infrastructure instead of chaining external LLM pools.
The platform ships REST APIs, dev kits for web and mobile, built-in telephony hooks, and tooling for RAG corpora and custom voice clones. Ultravox v0.7 reports leading Big Bench Audio scores (91.8%, or 97% with reasoning enabled), and the team publishes open-weight models on Hugging Face alongside research on turn-taking via UltraVAD.
Voice AI engineers, SaaS founders, and support automation teams use Ultravox when they need sub-second responses and natural back-and-forth over phone or browser audio, not just chat-style text bots.
Speech-native model skips transcription so tone and cadence stay in the loop
Ultravox v0.7 hits 91.8% on Big Bench Audio, 97% with reasoning enabled
Own inference stack keeps latency low without waiting on external LLM pools
REST APIs plus SDKs for web and mobile, with major telephony provider hooks
UltraVAD v0.1 predicts turn endings instead of cutting users off mid-thought
Open-weight models on Hugging Face under the fixie-ai org
Speech-native design avoids STT latency and keeps tone and cadence in the conversation.
Pay as You Go includes 30 free minutes per month with no subscription required.
Open-weight models on Hugging Face give teams visibility into the underlying speech stack.
Pro tier removes hard concurrency caps for production voice workloads.
Pay as You Go limits you to five concurrent calls.
Enterprise pricing and advanced limits require contacting sales.
Dedicated speech generation model is still marked as coming soon on the site.
Is Ultravox free to use?
Yes. Ultravox offers a Pay as You Go plan at $0 per month with 30 minutes of free calls, unlimited playground usage, and no surge pricing. Usage beyond the free minutes is billed at $0.05 per minute including TTS.
How much does Ultravox cost per minute?
Ultravox charges $0.05 per minute on Pay as You Go and Pro plans, with TTS included. SIP calls add $0.005 per minute on Pay as You Go or $0.0048 per minute on Pro. Thread token usage is $2 per million uncached input tokens and $15 per million output tokens.
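The per-minute and per-token rates above can be combined into a rough usage estimate. The sketch below is illustrative only: the function name, the plan keys, and the `free_minutes` parameter are hypothetical, and it assumes free minutes offset only the voice charge, that SIP minutes are billed in full, and that any fixed subscription fee is handled separately. Check the rates against the current pricing page before relying on the result.

```python
# Rates copied from the pricing answer above (USD); they may change over time.
VOICE_RATE_PER_MIN = 0.05                      # call minutes, TTS included
SIP_RATE_PER_MIN = {"payg": 0.005, "pro": 0.0048}
INPUT_TOKEN_RATE = 2.0 / 1_000_000             # per uncached input token
OUTPUT_TOKEN_RATE = 15.0 / 1_000_000           # per output token


def estimate_usage_cost(plan, call_minutes, sip_minutes=0.0,
                        input_tokens=0, output_tokens=0, free_minutes=0.0):
    """Estimate usage charges in USD, excluding any fixed subscription fee.

    Assumption: free minutes reduce only the voice charge, never SIP or tokens.
    """
    if plan not in SIP_RATE_PER_MIN:
        raise ValueError(f"unknown plan: {plan!r}")
    if min(call_minutes, sip_minutes, input_tokens, output_tokens, free_minutes) < 0:
        raise ValueError("usage values must be non-negative")

    billable_minutes = max(call_minutes - free_minutes, 0.0)
    voice_cost = billable_minutes * VOICE_RATE_PER_MIN
    sip_cost = sip_minutes * SIP_RATE_PER_MIN[plan]
    token_cost = (input_tokens * INPUT_TOKEN_RATE
                  + output_tokens * OUTPUT_TOKEN_RATE)
    return round(voice_cost + sip_cost + token_cost, 4)


if __name__ == "__main__":
    # 1,000 call minutes, all over SIP, on Pay as You Go with 30 free minutes.
    print(estimate_usage_cost("payg", 1000, sip_minutes=1000, free_minutes=30))
```

Keeping the rates as module-level constants makes it easy to update them in one place when pricing changes.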
What is Ultravox's speech-native approach?
Ultravox processes audio with a speech-native model instead of converting speech to text first. That design reduces latency before inference starts and preserves paralinguistic signals like tone, cadence, and pitch that transcription typically drops.
Does Ultravox support telephony integrations?
Yes. Ultravox includes built-in integrations with major telephony providers, and its plans list per-minute SIP pricing. The Pro tier adds an outbound call scheduler for planned voice campaigns.
What models does Ultravox offer?
Ultravox v0.7 is the core speech model, scoring 91.8% on Big Bench Audio without reasoning and 97% with reasoning enabled. UltraVAD v0.1 handles dynamic endpointing and turn-taking. Open-weight releases are on Hugging Face at huggingface.co/fixie-ai.
How many concurrent calls can I run on Ultravox?
Pay as You Go on Ultravox caps concurrent calls at five. The Pro plan at $100 per month removes hard concurrency limits, and Enterprise plans offer customizable concurrency with priority SLA options.

