LiveKit Agents: Real-time media infrastructure platform with an integrated agent framework for building voice and video AI assistants that can participate in live conversations. Enables developers to build programmable AI agents for WebRTC rooms, SIP telephony, and multimodal applications.
Build AI agents that join voice and video calls — your AI can talk, listen, and see in real-time conversations.
LiveKit Agents is a developer framework and cloud platform for building real-time AI agents that can join live voice, video, telephony, and multimodal conversations, with a free Build plan at $0/month and paid Ship and Scale tiers starting at $50/month and $500/month minimums. It is best understood as programmable real-time media infrastructure with an agent runtime layered on top: developers write agent workers, connect speech and language models through plugins or LiveKit Inference, and then deploy those agents into LiveKit rooms, WebRTC applications, or SIP phone workflows. That makes it a strong fit for teams that need more control than a turnkey phone-agent builder provides, especially when the product involves video, browser or mobile SDKs, custom frontend state, real-time interruption handling, or self-hosted media infrastructure.
The public LiveKit materials give several concrete facts that help define the product. The Build plan is $0/month and includes 1,000 free agent session minutes monthly, 1 free LiveKit phone number, agent deployment, agent observability, inference credits, the global edge network, and session metrics and analytics. The Ship plan has a published minimum of $50/month and adds team collaboration, rollback to previous agent deployments, email support, and shared production billing across projects. The Scale plan has a published minimum of $500/month and adds role-based access, metrics export APIs, region pinning, security reports / HIPAA, and inference discounts. LiveKit's pricing calculator lists agent session usage at $0.0100 per minute and telephony usage at $0.0100 per minute before model-specific inference charges. LiveKit's quotas documentation describes the free Build quota as 1,000 agent session minutes, 100,000 agent observability events, 1,000 minutes of agent audio recordings, $2.50 in LiveKit Inference credits, 50 US local inbound minutes, 1 included US local phone number, and 1,000 third-party SIP minutes.
For engineering teams, the main advantage is architectural flexibility. LiveKit Agents can use the common STT-to-LLM-to-TTS pattern, but it also supports realtime speech-to-speech model integrations where available. The AgentSession abstraction coordinates user input, model calls, tool use, output speech, observability events, turn-taking, and conversation state, while LiveKit rooms handle media tracks and participant connectivity. Teams can connect an agent to a web or mobile client through LiveKit SDKs, expose it to the phone network through SIP, or place it inside a broader video or multimodal application where the agent needs access to audio, video, data messages, or application-specific state.
The tradeoff is that LiveKit Agents is not a no-code product. Teams should expect to build and operate Python or Node.js agent code, choose and configure model providers, manage tool-calling behavior, test latency under realistic load, and monitor usage across plan fees, agent session minutes, telephony, and inference. Security-sensitive teams should also distinguish between LiveKit Cloud features and self-hosted responsibilities. LiveKit publishes SOC 2 Type II, GDPR, HIPAA BAA availability for eligible customers, encrypted WebRTC media transport, JWT-based room access, and open-source components, but deployment-specific controls such as retention, observability access, recording storage, and compliance evidence still need to be validated against the team's architecture and contract.
Was this helpful?
LiveKit Agents receives strong marks for being open-source and built on production-oriented WebRTC infrastructure. Developers value the plugin architecture, SIP support, and ability to build custom real-time voice and video agents, while nontechnical teams may find it less turnkey than hosted no-code voice-agent tools.
Orchestrates Speech-to-Text, LLM inference, and Text-to-Speech in a streaming pipeline. LiveKit Agents supports streaming components so developers can build responsive voice agents while retaining control over model choice and conversation state.
Supports direct audio-to-audio pipelines through realtime model integrations where available, reducing the need to stitch together separate transcription, text generation, and speech synthesis steps for some use cases.
Includes voice activity detection and turn-taking support so agents can respond to speech boundaries and handle user interruptions during live conversations.
Uses interchangeable plugins for speech, language, and voice providers. Teams should verify the current provider list in LiveKit documentation because supported integrations can change over time.
Native SIP trunk integration connects voice agents to the public telephone network. Build inbound IVR systems, outbound calling workflows, or call center bots that interact with regular phone callers.
Agents can invoke tools and external APIs mid-conversation. The framework manages asynchronous tool execution while maintaining conversation state, allowing voice agents to look up data, book appointments, or trigger workflow actions.
$0/month
$50/month minimum
$500/month minimum
Contact sales for contract pricing
Ready to get started with LiveKit Agents?
View Pricing Options →LiveKit Agents works with these platforms and services:
We believe in transparent reviews. Here's what LiveKit Agents doesn't handle well:
Weekly insights on the latest AI tools, features, and trends delivered to your inbox.
LiveKit's 2026 documentation and pricing materials emphasize AI voice and video agents, LiveKit Cloud agent deployment, LiveKit Inference, realtime model integrations, virtual avatars, SIP telephony, and observability for production agent sessions.
Voice AI
Vapi is the developer platform for voice AI agents — build, deploy, and scale phone agents with usage-based pricing and bring-your-own model keys.
Voice AI
Retell AI is an end-to-end platform for building, deploying and monitoring voice AI agents that handle phone calls at production scale.
Voice AI
Enterprise voice AI platform with self-hosted models, sub-second latency and large-scale phone agent infrastructure.
No reviews yet. Be the first to share your experience!
Get started with LiveKit Agents and see if it's the right fit for your needs.
Get Started →Take our 60-second quiz to get personalized tool recommendations
Find Your Perfect AI Stack →Explore 20 ready-to-deploy AI agent templates for sales, support, dev, research, and operations.
Browse Agent Templates →