This document introduces the LiveKit Agents framework, a Python library for building realtime, programmable voice AI agents that run on servers. It explains the framework's purpose, architecture, and key components.
For information about installing and running your first agent, see Quick Start. For details on the monorepo structure and version management, see Project Structure and Versioning. For in-depth coverage of the runtime architecture, see Core Architecture.
LiveKit Agents is a framework for building conversational, multi-modal voice agents that can see, hear, and understand. The framework provides:
lk-agents CLI. README.md137
The framework handles low-level streaming logic, turn detection, and tool orchestration so developers can focus on defining agent instructions and implementing function tools.
Sources: README.md23-44 livekit-agents/livekit/agents/worker.py121-125 livekit-agents/livekit/agents/voice/__init__.py70-94
LiveKit Agents uses a layered architecture organized by functional concern: deployment infrastructure, agent orchestration, voice interaction processing, I/O management, AI provider abstraction, and the plugin ecosystem.
The architecture separates deployment infrastructure from agent logic, and isolates provider-specific implementations into plugins. AgentSession serves as the central orchestrator binding all layers together.
Sources: livekit-agents/livekit/agents/worker.py121-125 livekit-agents/livekit/agents/voice/agent_session.py200 livekit-agents/livekit/agents/voice/agent_activity.py139-141 livekit-agents/livekit/agents/voice/room_io.py101
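The separation between orchestration and provider-specific plugins can be illustrated with a small, standard-library-only sketch. It is not the LiveKit Agents API: `SpeechToText`, `LanguageModel`, `TextToSpeech`, `ToySession`, and the three toy providers are hypothetical names invented here, assuming only that the orchestrator depends on shared interfaces rather than on any one provider.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces standing in for the abstraction layer.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, instructions: str, user_text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


# Hypothetical "plugins": each hides one provider behind a shared interface.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")  # pretend the audio bytes are already text


class UppercaseLLM:
    def reply(self, instructions: str, user_text: str) -> str:
        return user_text.upper()  # placeholder for a real model call


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")  # placeholder for synthesized audio


@dataclass
class ToySession:
    """Toy orchestrator that depends only on the interfaces above."""

    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech
    instructions: str = "You are a helpful assistant."

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)
        answer = self.llm.reply(self.instructions, text)
        return self.tts.synthesize(answer)


session = ToySession(stt=EchoSTT(), llm=UppercaseLLM(), tts=BytesTTS())
result = session.handle_turn(b"hello agent")
```

Swapping a provider in this sketch means passing a different object to `ToySession`; the orchestration code does not change, which is the property the layered design aims for.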
This sequence shows the complete agent lifecycle. The AgentServer registers with LiveKit and receives job assignments, delegating execution to isolated JobProcess instances.
Sources: livekit-agents/livekit/agents/worker.py121-125 livekit-agents/livekit/agents/job.py36-39 README.md108-137
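The register-then-dispatch lifecycle can be sketched with the standard library alone. This is an illustration under stated assumptions, not the framework's implementation: `ToyServer`, `register`, `run_job`, and the JSON job format are all hypothetical, and process isolation is modeled by launching a fresh Python interpreter per job.

```python
import json
import os
import subprocess
import sys

# Code run inside each child process; it stands in for a job entrypoint.
CHILD_SOURCE = """
import json, os, sys
job = json.loads(sys.argv[1])
print(json.dumps({"room": job["room"], "pid": os.getpid()}))
"""


class ToyServer:
    """Hypothetical stand-in for a worker that accepts job assignments."""

    def __init__(self) -> None:
        self.registered = False

    def register(self) -> None:
        # A real worker would open a connection to the LiveKit server here.
        self.registered = True

    def run_job(self, job: dict) -> dict:
        if not self.registered:
            raise RuntimeError("register() must be called before accepting jobs")
        # Each job runs in its own interpreter; a failing job raises
        # CalledProcessError here instead of corrupting the server's state.
        proc = subprocess.run(
            [sys.executable, "-c", CHILD_SOURCE, json.dumps(job)],
            capture_output=True,
            text=True,
            check=True,
        )
        return json.loads(proc.stdout)


parent_pid = os.getpid()
server = ToyServer()
server.register()
results = [server.run_job({"room": name}) for name in ("room-a", "room-b")]
```

Each entry in `results` carries the process ID of the child that handled it, which differs from `parent_pid` because the job never ran inside the server process.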
| Layer | Key Components | Responsibilities |
|---|---|---|
| Deployment | AgentServer, WorkerOptions, JobProcess | Process management, job distribution, worker registration livekit-agents/livekit/agents/worker.py121-125 |
| Orchestration | AgentSession, AgentTask, ChatContext | Conversation lifecycle, state transitions, speech scheduling livekit-agents/livekit/agents/voice/agent_session.py200 |
| Pipeline | AgentActivity, AudioRecognition, TurnHandlingOptions | Audio processing, turn detection, endpointing, interruption detection livekit-agents/livekit/agents/voice/agent_activity.py139-141 livekit-agents/livekit/agents/voice/audio_recognition.py126-140 |
| I/O | RoomIO, AudioInput, AudioOutput | LiveKit room connections, audio/video stream management livekit-agents/livekit/agents/voice/room_io.py101 |
| Abstraction | llm.LLM, stt.STT, tts.TTS | Standard interfaces for multi-provider support livekit-agents/livekit/agents/__init__.py23 |
Sources: livekit-agents/livekit/agents/worker.py121-125 livekit-agents/livekit/agents/voice/agent_session.py200 livekit-agents/livekit/agents/voice/turn.py114-119
Sources: livekit-agents/livekit/agents/worker.py121-125 livekit-agents/livekit/agents/voice/agent_session.py200 README.md137
worker.AgentServer livekit-agents/livekit/agents/worker.py121-125
job.JobContext livekit-agents/livekit/agents/job.py34-39
rtc.Room connection and job metadata.
voice.AgentSession livekit-agents/livekit/agents/voice/agent_session.py200
voice.Agent livekit-agents/livekit/agents/voice/agent.py36-57
instructions and tools. livekit-agents/livekit/agents/voice/agent.py75-76
llm.ChatContext livekit-agents/livekit/agents/llm/chat_context.py233-247
ChatMessage objects. Supports system, user, assistant, and tool roles. livekit-agents/livekit/agents/llm/chat_context.py37-41
voice.room_io.RoomIO livekit-agents/livekit/agents/voice/room_io.py101
Sources: livekit-agents/livekit/agents/voice/agent_session.py200 livekit-agents/livekit/agents/voice/agent.py36-57 README.md71-77
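The idea of a chat context as an ordered collection of role-tagged messages can be shown with a minimal sketch. `ToyMessage` and `ToyChatContext` are hypothetical names and do not reproduce the signatures of `llm.ChatContext`; the only detail taken from the text above is the set of four roles.

```python
from dataclasses import dataclass, field
from typing import Literal

Role = Literal["system", "user", "assistant", "tool"]
VALID_ROLES = ("system", "user", "assistant", "tool")


@dataclass
class ToyMessage:
    role: Role
    content: str


@dataclass
class ToyChatContext:
    """Hypothetical container for an ordered conversation history."""

    messages: list[ToyMessage] = field(default_factory=list)

    def add_message(self, role: Role, content: str) -> ToyMessage:
        # Reject roles outside the four listed in the text.
        if role not in VALID_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        msg = ToyMessage(role, content)
        self.messages.append(msg)
        return msg

    def copy(self) -> "ToyChatContext":
        # Shallow copy: a new list, so appends do not affect the original.
        return ToyChatContext(list(self.messages))


ctx = ToyChatContext()
ctx.add_message("system", "You are a concise voice assistant.")
ctx.add_message("user", "What is the weather like?")
snapshot = ctx.copy()
ctx.add_message("assistant", "Let me check.")
```

Taking a copy before appending, as in the last three lines, is one way a session could hand a stable snapshot of the history to a model call while the live context keeps growing.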
The framework implements a voice processing pipeline that handles concurrent audio input processing and speech generation.
Sources: livekit-agents/livekit/agents/voice/agent_activity.py139-141 livekit-agents/livekit/agents/voice/audio_recognition.py126-140 livekit-agents/livekit/agents/voice/generation.py59-70
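Concurrent input processing and speech generation can be approximated with two cooperating `asyncio` tasks joined by a queue. This is a simplified sketch, not the framework's pipeline: `recognize`, `generate`, and the use of strings in place of audio frames are assumptions made for illustration, and turn detection and interruption handling are left out.

```python
import asyncio
from typing import Optional


async def recognize(frames: list[str], transcripts: asyncio.Queue) -> None:
    """Input side: forward each finished user turn as soon as it is ready."""
    for frame in frames:
        await asyncio.sleep(0)  # yield control, as a real audio stream would
        await transcripts.put(frame)
    await transcripts.put(None)  # sentinel: the input stream has closed


async def generate(transcripts: asyncio.Queue, spoken: list[str]) -> None:
    """Output side: produce a reply for every transcript until the sentinel."""
    while True:
        text: Optional[str] = await transcripts.get()
        if text is None:
            break
        spoken.append(f"reply to: {text}")  # placeholder for LLM + TTS output


async def main() -> list[str]:
    transcripts: asyncio.Queue = asyncio.Queue()
    spoken: list[str] = []
    # Both sides run concurrently; neither waits for the other to finish.
    await asyncio.gather(
        recognize(["hello", "what time is it"], transcripts),
        generate(transcripts, spoken),
    )
    return spoken


spoken = asyncio.run(main())
```

The queue decouples the two sides, so recognition can keep accepting input while a reply for an earlier turn is still being produced.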
The framework supports complex multi-agent handoffs through the AgentHandoff and AgentTask abstractions. Tools can return an AgentHandoff object, which triggers the session to transition to a new agent while optionally maintaining conversation history. livekit-agents/livekit/agents/llm/chat_context.py43 livekit-agents/livekit/agents/voice/agent.py228
Sources: README.md146-175 livekit-agents/livekit/agents/voice/agent_session.py200
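The handoff behavior described above, where a tool's return value causes the session to switch agents, can be sketched as follows. `ToyAgent`, `ToyHandoff`, `HandoffSession`, and the `keep_history` flag are hypothetical stand-ins chosen for this example and are not the signatures of `AgentHandoff` or `AgentTask`.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ToyAgent:
    name: str
    instructions: str
    # Tool name -> zero-argument callable returning a string or a ToyHandoff.
    tools: dict[str, Callable[[], object]] = field(default_factory=dict)


@dataclass
class ToyHandoff:
    """Returned by a tool to request a switch to another agent."""

    new_agent: ToyAgent
    keep_history: bool = True


@dataclass
class HandoffSession:
    agent: ToyAgent
    history: list[str] = field(default_factory=list)

    def call_tool(self, tool_name: str) -> Optional[str]:
        result = self.agent.tools[tool_name]()
        if isinstance(result, ToyHandoff):
            # Optionally drop the previous conversation before switching.
            if not result.keep_history:
                self.history.clear()
            self.history.append(
                f"handoff: {self.agent.name} -> {result.new_agent.name}"
            )
            self.agent = result.new_agent
            return None
        self.history.append(f"tool {tool_name}: {result}")
        return str(result)


billing = ToyAgent("billing", "Handle billing questions.")
greeter = ToyAgent(
    "greeter",
    "Greet the caller and route them.",
    tools={"transfer_to_billing": lambda: ToyHandoff(billing)},
)
session = HandoffSession(greeter, history=["user: I have a billing question"])
session.call_tool("transfer_to_billing")
```

After the call, `session.agent` is the billing agent and the earlier user message is still in `session.history`, because the handoff in this example keeps the conversation history.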