LiveKit Agents: Real-time media infrastructure platform with an integrated agent framework for building voice and video AI assistants that can participate in live conversations. Enables developers to create AI agents that can see, hear, and speak in real-time video calls, with support for spatial audio, screen sharing, and multi-participant interactions.
Build AI agents that join voice and video calls — your AI can talk, listen, and see in real-time conversations.
LiveKit Agents is an open-source framework for building real-time, multimodal AI agents that can see, hear, and speak. Built on top of LiveKit's WebRTC infrastructure, it provides the transport layer and developer framework needed to create voice agents, video AI assistants, and other real-time AI applications that interact with users through audio and video streams rather than just text.
The framework's architecture centers on a worker process model where agent code runs as "workers" that connect to LiveKit rooms. When a user joins a room, the agent worker is dispatched to participate alongside them, receiving audio/video tracks and sending responses back in real-time. This design handles the complex WebRTC plumbing — media encoding/decoding, network adaptation, echo cancellation — so developers can focus on the AI logic.
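The worker model described above can be pictured with a small sketch. This is not LiveKit's dispatch logic: the `Worker` and `Dispatcher` classes, the `on_user_joined` hook, and the "fewest active rooms" policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical agent worker that tracks the rooms it is serving."""
    name: str
    rooms: list[str] = field(default_factory=list)


class Dispatcher:
    """Toy dispatcher: sends each new room to the least busy worker (assumed policy)."""

    def __init__(self) -> None:
        self.workers: list[Worker] = []

    def register(self, worker: Worker) -> None:
        self.workers.append(worker)

    def on_user_joined(self, room: str) -> Worker:
        if not self.workers:
            raise RuntimeError("no workers registered")
        # Pick the worker with the fewest rooms; ties go to the earliest registered.
        worker = min(self.workers, key=lambda w: len(w.rooms))
        worker.rooms.append(room)
        return worker


dispatcher = Dispatcher()
dispatcher.register(Worker("worker-a"))
dispatcher.register(Worker("worker-b"))
assigned = [dispatcher.on_user_joined(room).name for room in ("room-1", "room-2", "room-3")]
```

In a real deployment the media server, not application code, would decide which worker receives a job; the sketch only shows why adding workers spreads rooms across processes.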
LiveKit Agents provides a plugin system for integrating with AI services at each stage of the voice pipeline: Speech-to-Text (Deepgram, Google, AssemblyAI, Azure), LLMs (OpenAI, Anthropic, Google Gemini, local models), and Text-to-Speech (ElevenLabs, Cartesia, PlayHT, Azure). The framework handles the orchestration between these components, including critical details like Voice Activity Detection (VAD), interruption handling, and turn-taking that make voice conversations feel natural rather than robotic.
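A rough sketch of why a plugin system makes providers swappable: agent logic depends on an interface, and a configuration value selects the implementation. The `SpeechToText` protocol, both provider classes, and `build_agent` are hypothetical names, not part of the LiveKit Agents API.

```python
from typing import Callable, Protocol


class SpeechToText(Protocol):
    """Interface the agent depends on; any provider with this method fits."""

    def transcribe(self, audio: bytes) -> str: ...


class PlainSTT:
    """Hypothetical provider A: treats the audio bytes as UTF-8 text."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UppercaseSTT:
    """Hypothetical provider B: same interface, different output."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8").upper()


# The registry maps a configuration value to a provider class.
STT_PROVIDERS: dict[str, Callable[[], SpeechToText]] = {
    "plain": PlainSTT,
    "uppercase": UppercaseSTT,
}


def build_agent(config: dict) -> Callable[[bytes], str]:
    """Return an agent handler; only the config decides which provider is used."""
    stt = STT_PROVIDERS[config["stt"]]()

    def handle(audio: bytes) -> str:
        return f"heard: {stt.transcribe(audio)}"

    return handle
```

Changing `{"stt": "plain"}` to `{"stt": "uppercase"}` switches providers while `handle` stays untouched, which is the property the plugin architecture is meant to give.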
A key technical differentiator is LiveKit's approach to latency. The framework supports "speech-to-speech" pipelines where audio goes directly to multimodal models (like GPT-4o Realtime) without intermediate transcription, achieving sub-second response times. For traditional STT→LLM→TTS pipelines, it implements streaming at every stage — the LLM starts generating while transcription finishes, and TTS starts speaking while the LLM is still generating — minimizing perceived latency.
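The stage-by-stage streaming described above can be illustrated with chained async generators. This is a minimal sketch with stand-in stages (`transcribe`, `generate`, and `speak` are hypothetical and make no network calls); it only shows that the first word is spoken before the last one is transcribed.

```python
import asyncio


async def transcribe(chunks, log):
    """Stand-in STT: yields one word per incoming audio chunk."""
    for chunk in chunks:
        await asyncio.sleep(0)  # simulate waiting for more audio
        log.append(f"stt:{chunk}")
        yield chunk


async def generate(words, log):
    """Stand-in LLM: emits a token as soon as each word arrives."""
    async for word in words:
        token = word.upper()
        log.append(f"llm:{token}")
        yield token


async def speak(tokens, log):
    """Stand-in TTS: 'speaks' each token as soon as it is generated."""
    async for token in tokens:
        log.append(f"tts:{token}")


def run_pipeline(chunks):
    """Wire the three stages together and return the order of events."""
    log = []
    asyncio.run(speak(generate(transcribe(chunks, log), log), log))
    return log
```

Calling `run_pipeline(["hello", "world"])` records `tts:HELLO` before `stt:world`, whereas a sequential pipeline would finish all transcription before any speech began.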
The platform is fully open-source (Apache 2.0) with the agent framework, server, and client SDKs all available on GitHub. LiveKit Cloud provides managed infrastructure for teams that don't want to operate their own WebRTC servers, with a free tier for development. Self-hosting is straightforward with Docker or Kubernetes, giving teams full control over their data and infrastructure.
For production deployments, LiveKit Agents supports horizontal scaling across multiple worker processes, health monitoring, graceful shutdown, and automatic reconnection. The framework includes built-in support for function calling, allowing voice agents to execute tools and access external systems during conversations. This makes it suitable for building production voice AI applications like customer service agents, AI tutors, telehealth assistants, and meeting copilots.
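Function calling during a conversation can be sketched as a registry of async tools plus a handler that records each result in the conversation history. The `tool` decorator, `book_appointment`, and the message format below are illustrative assumptions and do not reflect the framework's actual tool API.

```python
import asyncio
import json

TOOLS = {}


def tool(fn):
    """Register an async function under its own name (hypothetical decorator)."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
async def book_appointment(day: str) -> dict:
    await asyncio.sleep(0)  # stands in for a call to an external system
    return {"booked": day}


async def run_tool_call(history: list, call: dict) -> dict:
    """Execute the requested tool and append its result to the conversation."""
    fn = TOOLS.get(call["name"])
    if fn is None:
        result = {"error": f"unknown tool {call['name']}"}
    else:
        result = await fn(**call["arguments"])
    # Keeping the result in the history lets the model phrase it as speech later.
    history.append({"role": "tool", "name": call["name"], "content": json.dumps(result)})
    return result
```

Because the tool is awaited rather than run in a blocking call, the event loop stays free to keep handling audio while the external system responds.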
LiveKit Agents receives strong marks for being fully open-source with production-quality WebRTC infrastructure. Developers appreciate the plugin architecture and the quality of voice agent experiences it enables. The main complaints are a steep learning curve for WebRTC concepts, documentation gaps for advanced use cases, and the complexity of self-hosting the full stack at scale.
Orchestrates Speech-to-Text, LLM inference, and Text-to-Speech in a streaming pipeline. LiveKit Agents streams output at every stage — LLM starts generating while transcription finishes, TTS begins speaking while the LLM is still responding — achieving 500-800ms total latency vs 2-3 seconds with naive sequential implementations.
Supports direct audio-to-audio pipelines via OpenAI GPT-4o Realtime API, bypassing intermediate transcription entirely. Achieves sub-300ms response times and preserves emotional tone and prosody that text-based pipelines lose during transcription and synthesis.
Built-in VAD using Silero models detects when users start or stop speaking. The framework gracefully handles interruptions — when a user speaks mid-response, the agent stops immediately, processes the interruption, and responds naturally without requiring custom state machine logic.
Interchangeable plugins for STT (Deepgram, Google, AssemblyAI, Azure, Whisper), LLM (OpenAI, Anthropic, Google Gemini, local Ollama), and TTS (ElevenLabs, Cartesia, PlayHT, Azure, Google). Switch providers with a single configuration change without refactoring agent logic.
Native SIP trunk integration connects voice agents to the public telephone network. Build inbound IVR systems, outbound calling campaigns, or call center bots that interact with regular phone calls — no separate telephony SDK or Twilio-level abstraction required.
Agents invoke tools and external APIs mid-conversation. The framework manages async tool execution while maintaining conversation state, allowing voice agents to look up CRM data, book appointments, trigger workflows, and return results as natural spoken responses.
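The interruption handling listed among the features above (the agent stops speaking as soon as the user starts) maps naturally onto task cancellation. The sketch below assumes a simulated voice-activity signal; `speak`, `user_interrupts`, and `converse` are hypothetical helpers, not LiveKit functions.

```python
import asyncio


async def speak(words, spoken):
    """Play the reply one word at a time, yielding control between words."""
    for word in words:
        spoken.append(word)
        await asyncio.sleep(0)


async def user_interrupts(after_words, spoken, playback):
    """Simulated VAD: the user starts talking once `after_words` words were played."""
    while len(spoken) < after_words and not playback.done():
        await asyncio.sleep(0)
    playback.cancel()  # has no effect if playback already finished


async def converse(reply_words, interrupt_after_words):
    spoken = []
    playback = asyncio.create_task(speak(reply_words, spoken))
    vad = asyncio.create_task(user_interrupts(interrupt_after_words, spoken, playback))
    try:
        await playback
        interrupted = False
    except asyncio.CancelledError:
        interrupted = True  # playback was cut off mid-reply
    await vad
    return spoken, interrupted
```

Cancelling the playback task is what removes the need for a hand-written state machine here: the reply simply stops at the next point where it yields control.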
Contact for pricing