Plans designed to scale with your projects
From building your first AI voice or video agent to realtime applications with millions of users and everything in between.
Estimate costs for AI voice and video agents
Preview the per-minute cost to run an agent on LiveKit Cloud. Our plans include monthly allotments for agent session minutes, inbound calling minutes (for US local phone numbers), and inference credits to call the most popular AI models.
For detailed LLM, STT, and TTS model pricing, see Inference pricing.
For detailed provider and model API support, see Documentation.
LLM model prices (per minute)
- DeepSeek DeepSeek-V3: $0.0024/min
- Google Gemini 2.5 Flash: $0.0013/min
- Google Gemini 2.5 Flash-Lite: $0.0004/min
- Google Gemini 2.5 Pro: $0.0101/min
- Google Gemini 3 Flash: $0.0020/min
- Google Gemini 3.1 Flash Lite: $0.0010/min
- Google Gemini 3.1 Pro: $0.0152/min
- Moonshot AI Kimi K2.5: $0.0023/min
- OpenAI GPT-4.1: $0.0074/min
- OpenAI GPT-4.1 mini: $0.0015/min
- OpenAI GPT-4.1 nano: $0.0004/min
- OpenAI GPT-4o: $0.0093/min
- OpenAI GPT-4o mini: $0.0006/min
- OpenAI GPT-5: $0.0055/min
- OpenAI GPT-5 mini: $0.0011/min
- OpenAI GPT-5 nano: $0.0002/min
- OpenAI GPT-5.1: $0.0055/min
- OpenAI GPT-5.2: $0.0077/min
- OpenAI GPT-5.2 Chat: $0.0077/min
- OpenAI GPT-5.3 Chat: $0.0077/min
- OpenAI GPT-5.4: $0.0189/min
- OpenAI GPT OSS 120B: $0.0006/min
- Qwen Qwen3 235B-A22B Instruct: $0.0008/min
- OpenAI GPT Realtime: $0.0676/min
- OpenAI GPT Realtime mini: $0.0216/min
- Gemini Live 2.5 Flash Native Audio: $0.0144/min
- Gemini Live 2.5 Flash: $0.0144/min
STT model prices — Build/Ship plan (per minute)
- AssemblyAI Universal-3 Pro Streaming: $0.0075/min
- AssemblyAI Universal-Streaming: $0.0025/min
- AssemblyAI Universal-Streaming-Multilingual: $0.0025/min
- Cartesia Ink Whisper: $0.0030/min
- Deepgram Flux: $0.0077/min
- Deepgram Nova-2: $0.0058/min
- Deepgram Nova-2 Conversational AI: $0.0058/min
- Deepgram Nova-2 Medical: $0.0058/min
- Deepgram Nova-2 Phone Call: $0.0058/min
- Deepgram Nova-3 (Monolingual): $0.0077/min
- Deepgram Nova-3 Medical: $0.0077/min
- Deepgram Nova-3 (Multilingual): $0.0092/min
- ElevenLabs Scribe v2 Realtime: $0.0105/min
STT model prices — Scale plan (per minute)
- AssemblyAI Universal-3 Pro Streaming: $0.0075/min
- AssemblyAI Universal-Streaming: $0.0025/min
- AssemblyAI Universal-Streaming-Multilingual: $0.0025/min
- Cartesia Ink Whisper: $0.0023/min
- Deepgram Flux: $0.0065/min
- Deepgram Nova-2: $0.0047/min
- Deepgram Nova-2 Conversational AI: $0.0047/min
- Deepgram Nova-2 Medical: $0.0047/min
- Deepgram Nova-2 Phone Call: $0.0047/min
- Deepgram Nova-3 (Monolingual): $0.0065/min
- Deepgram Nova-3 Medical: $0.0065/min
- Deepgram Nova-3 (Multilingual): $0.0078/min
- ElevenLabs Scribe v2 Realtime: $0.0105/min
TTS model prices — Build/Ship plan (per minute)
- Cartesia Sonic: $0.0300/min
- Cartesia Sonic 2: $0.0300/min
- Cartesia Sonic 3: $0.0300/min
- Cartesia Sonic 3 (2025-10-27): $0.0300/min
- Cartesia Sonic 3 (2026-01-12): $0.0300/min
- Cartesia Sonic Turbo: $0.0300/min
- Deepgram Aura-2: $0.0180/min
- ElevenLabs Eleven Flash v2: $0.0900/min
- ElevenLabs Eleven Flash v2.5: $0.0900/min
- ElevenLabs Eleven Multilingual v2: $0.1800/min
- ElevenLabs Eleven Turbo v2: $0.0900/min
- ElevenLabs Eleven Turbo v2.5: $0.0900/min
- Inworld Inworld TTS 1: $0.0030/min
- Inworld Inworld TTS 1 Max: $0.0060/min
- Inworld Inworld TTS 1.5 Max: $0.0060/min
- Inworld Inworld TTS 1.5 Mini: $0.0030/min
- Rime Arcana: $0.0240/min
- Rime Mist: $0.0180/min
- Rime Mist v2: $0.0180/min
- xAI Text to Speech: $0.0025/min
TTS model prices — Scale plan (per minute)
- Cartesia Sonic: $0.0225/min
- Cartesia Sonic 2: $0.0225/min
- Cartesia Sonic 3: $0.0225/min
- Cartesia Sonic 3 (2025-10-27): $0.0225/min
- Cartesia Sonic 3 (2026-01-12): $0.0225/min
- Cartesia Sonic Turbo: $0.0225/min
- Deepgram Aura-2: $0.0162/min
- ElevenLabs Eleven Flash v2: $0.0360/min
- ElevenLabs Eleven Flash v2.5: $0.0360/min
- ElevenLabs Eleven Multilingual v2: $0.0720/min
- ElevenLabs Eleven Turbo v2: $0.0360/min
- ElevenLabs Eleven Turbo v2.5: $0.0360/min
- Inworld Inworld TTS 1: $0.0030/min
- Inworld Inworld TTS 1 Max: $0.0060/min
- Inworld Inworld TTS 1.5 Max: $0.0060/min
- Inworld Inworld TTS 1.5 Mini: $0.0030/min
- Rime Arcana: $0.0180/min
- Rime Mist: $0.0120/min
- Rime Mist v2: $0.0120/min
- xAI Text to Speech: $0.0025/min
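As a rough illustration of how the per-minute prices above combine, the following sketch adds one LLM, one STT, and one TTS price for each plan tier. It assumes the three prices are simply additive for each minute of conversation, and it covers model inference only: agent session minutes, calling minutes, and the monthly inference credits included in each plan are not modeled. The function names and the `build_ship`/`scale` keys are hypothetical, chosen only for this example.

```python
from decimal import Decimal

# Per-minute prices (USD) copied from the lists above.
# LLM prices are listed once, with no per-plan split.
LLM = {"OpenAI GPT-4.1 mini": Decimal("0.0015")}

# STT and TTS prices differ between the Build/Ship and Scale plans.
STT = {
    "build_ship": {"Deepgram Nova-3 (Monolingual)": Decimal("0.0077")},
    "scale": {"Deepgram Nova-3 (Monolingual)": Decimal("0.0065")},
}
TTS = {
    "build_ship": {"Cartesia Sonic 3": Decimal("0.0300")},
    "scale": {"Cartesia Sonic 3": Decimal("0.0225")},
}


def inference_cost_per_minute(plan: str, llm: str, stt: str, tts: str) -> Decimal:
    """Sum the listed LLM, STT, and TTS prices for one minute (assumed additive)."""
    return LLM[llm] + STT[plan][stt] + TTS[plan][tts]


def inference_cost(plan: str, llm: str, stt: str, tts: str, minutes: int) -> Decimal:
    """Estimated inference cost for a number of minutes, before any plan credits."""
    return inference_cost_per_minute(plan, llm, stt, tts) * minutes


if __name__ == "__main__":
    models = ("OpenAI GPT-4.1 mini", "Deepgram Nova-3 (Monolingual)", "Cartesia Sonic 3")
    for plan in ("build_ship", "scale"):
        print(plan, inference_cost_per_minute(plan, *models), inference_cost(plan, *models, 1000))
```

With these three models, the listed prices add up to $0.0392/min on Build/Ship and $0.0305/min on Scale. `Decimal` is used rather than floats so that sums of four-decimal prices stay exact.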
- Build: $0/mo
- Ship: $50/mo
- Scale: $500/mo
- Enterprise: Custom
Inference pricing
Transcode minutes
FAQs
What's the difference between agent deployments, concurrent agent sessions, and LiveKit Inference concurrency?
An agent deployment is a running version of your agent backend hosted on LiveKit Cloud, typically with a unique prompt, set of voice AI models, and function calls. You can configure your agent to complete different tasks or workflows. Deploy separate agents when you need distinct reasoning behavior or tool access (e.g., a front-office receptionist agent that handles inbound phone calls for appointment scheduling and triage versus a back-office agent that makes outbound calls to insurance providers to verify patient coverage).
A concurrent agent session is a live interaction between your agent and an end user. If your agent is handling 10 calls or conversations at the same time, that counts as 10 concurrent sessions, regardless of how many agent deployments you have on LiveKit Cloud.
LiveKit Inference concurrency refers specifically to how many AI inference requests across LLM, STT, and TTS can run at the same time through LiveKit Inference. It limits how many model calls can be processed concurrently, independent of how many agent sessions or deployments you have. The LiveKit Inference concurrency limit for each plan applies to your aggregate usage of a model type (e.g., total connections to any LiveKit Inference STT). For example, if there are 10 concurrent agent sessions running and the agent is configured to use LiveKit Inference for STT, then there are 10 concurrent STT connections.
For more information on LiveKit Cloud quotas and limits, refer to our docs.
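The aggregation described above can be sketched in a few lines. This is a minimal illustration, not LiveKit's accounting logic: it assumes each live session holds one connection per model type that is routed through LiveKit Inference, as in the 10-session STT example. The deployment names, the dictionary layout, and the limit values are all hypothetical; actual limits depend on your plan.

```python
from collections import Counter


def concurrent_inference_connections(deployments):
    """Total concurrent connections per model type, summed across all deployments.

    Each deployment is a dict with:
      "sessions": number of live agent sessions
      "inference_types": model types ("llm", "stt", "tts") using LiveKit Inference
    Assumes one connection per session per model type.
    """
    totals = Counter()
    for deployment in deployments:
        for model_type in deployment["inference_types"]:
            totals[model_type] += deployment["sessions"]
    return dict(totals)


def over_limit(totals, limits):
    """Return the model types whose aggregate connections exceed the given limits."""
    return sorted(t for t, n in totals.items() if n > limits.get(t, 0))


if __name__ == "__main__":
    deployments = [
        {"name": "receptionist", "sessions": 6, "inference_types": {"stt", "tts"}},
        {"name": "back-office", "sessions": 4, "inference_types": {"stt"}},
    ]
    totals = concurrent_inference_connections(deployments)
    print(totals)
    # Hypothetical limits, for illustration only.
    print(over_limit(totals, {"stt": 8, "tts": 8}))
```

Here two deployments with 6 and 4 live sessions both use LiveKit Inference for STT, so they count as 10 concurrent STT connections in aggregate, while only the first contributes to the TTS total.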
Can I self-host LiveKit?
The LiveKit Agents framework and LiveKit media server are both completely open source and available to run locally or host on your own infrastructure.
LiveKit Cloud is the best way to run LiveKit in production, with fully managed agent deployments, built-in observability and dashboards, and ultra-low-latency global media transport.
Sign up for LiveKit Cloud here, or refer to our docs on how to run LiveKit's media server locally or deploy LiveKit Agents in a custom environment.
Do you offer on-premises or private deployments?
Yes. Contact sales so we can better understand your needs.
Ready to build?
Start building a voice AI agent with a free account. Reach out to us if you're interested in custom pricing.
No credit card required • 1,000 free agent session minutes monthly