Connect Vision Agents to PSTN phone calls. A telephony provider handles the phone network; your FastAPI server receives webhooks, streams audio over WebSocket, and bridges it into a Stream call where your agent runs.
Vision Agents requires a Stream account for real-time transport.

Choose a provider

Twilio

Media Streams, TwiML, built-in webhook helpers — default in our examples

Telnyx

Call Control API, bidirectional media streaming

How phone agents work

Every phone integration follows the same pattern, regardless of provider:
  1. Webhook: the provider notifies your server of an incoming call (or you initiate an outbound call via REST API)
  2. Register: create a call entry in a provider registry with a secret token
  3. Stream URL: tell the provider to open a WebSocket media stream to your server
  4. Bridge: attach_phone_to_call connects phone audio ↔ a Stream call where your agent listens and responds
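
Steps 1 to 3 can be sketched with the standard library alone. This is a minimal illustration, not the plugin's implementation: `REGISTRY`, `register_call`, `media_url`, `twiml_for_stream`, and `handle_incoming_call` are hypothetical names, and the TwiML assumes Twilio's bidirectional `<Connect><Stream>` form. Step 4 is left to attach_phone_to_call, whose signature is not shown here.

```python
import secrets
from xml.etree import ElementTree as ET

# Hypothetical in-memory registry: call_id -> secret token.
REGISTRY: dict[str, str] = {}


def register_call(call_id: str) -> str:
    """Step 2: store the call together with a fresh secret token."""
    token = secrets.token_urlsafe(16)
    REGISTRY[call_id] = token
    return token


def media_url(host: str, provider: str, call_id: str, token: str) -> str:
    """Step 3: tokenized WebSocket URL the provider is told to connect to."""
    return f"wss://{host}/{provider}/media/{call_id}/{token}"


def twiml_for_stream(url: str) -> str:
    """Build a TwiML reply asking Twilio to open a media stream to `url`."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", url=url)
    return ET.tostring(response, encoding="unicode")


def handle_incoming_call(call_id: str, host: str) -> str:
    """Step 1: what a webhook handler would return for an incoming call."""
    token = register_call(call_id)
    return twiml_for_stream(media_url(host, "twilio", call_id, token))
```

In a FastAPI app, the string returned by `handle_incoming_call` would be sent back as the webhook response with an XML content type.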

Shared building blocks

Both telephony plugins expose the same core primitives:
| Building block | Role |
| --- | --- |
| Call registry | Tracks active calls, assigns tokens, optional async prepare to warm up the agent |
| Media stream | WebSocket handler that receives/sends phone audio |
| attach_phone_to_call | Bridges provider audio ↔ Stream WebRTC participant |
| Tokenized media URL | Provider connects to wss://<host>/<provider>/media/{call_id}/{token} |
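
To make the token check concrete, here is a rough sketch of how a WebSocket handler might validate the tokenized media URL before accepting audio. `CallRegistry` and `parse_media_path` below are hypothetical stand-ins written for this example; the real plugin registries may expose a different interface.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class CallRegistry:
    """Hypothetical stand-in for a provider call registry."""

    _tokens: dict[str, str] = field(default_factory=dict)

    def create(self, call_id: str) -> str:
        """Track a new call and return its secret token."""
        token = secrets.token_urlsafe(16)
        self._tokens[call_id] = token
        return token

    def validate(self, call_id: str, token: str) -> bool:
        """Return True only if the token matches the one issued for call_id."""
        expected = self._tokens.get(call_id)
        if expected is None:
            return False
        # Constant-time comparison on bytes avoids timing side channels.
        return secrets.compare_digest(expected.encode(), token.encode())

    def remove(self, call_id: str) -> None:
        """Forget a call once it has ended."""
        self._tokens.pop(call_id, None)


def parse_media_path(path: str) -> tuple[str, str, str]:
    """Split /<provider>/media/{call_id}/{token} into its three parts."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[1] != "media":
        raise ValueError(f"unexpected media path: {path!r}")
    provider, _, call_id, token = parts
    return provider, call_id, token
```

A handler would call `parse_media_path` on the incoming WebSocket path and close the connection when `validate` returns False.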
Provider-specific differences:
| | Twilio | Telnyx |
| --- | --- | --- |
| Answer mechanism | Return TwiML with <Stream> | Call Control Answer/Dial API with stream_url |
| Webhook auth | verify_twilio_signature (built-in) | Ed25519 verification (see plugin examples) |
| Audio encoding | mulaw 8 kHz | PCMU/PCMA 8 kHz or L16 16 kHz |
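
The audio encoding row matters because 8 kHz mu-law has to be expanded to linear PCM before most speech pipelines can use it. The sketch below decodes one Twilio Media Streams `media` event (JSON carrying a base64 mu-law payload) into 16-bit PCM using the standard G.711 expansion. The function names are illustrative, the plugins are expected to handle this conversion for you, and the Telnyx message shape is not covered here.

```python
import base64
import json
import struct


def ulaw_byte_to_pcm16(u: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~u & 0xFF  # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_media_message(raw: str) -> bytes:
    """Return little-endian 16-bit PCM for a 'media' event, else b''."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return b""
    ulaw = base64.b64decode(msg["media"]["payload"])
    samples = [ulaw_byte_to_pcm16(b) for b in ulaw]
    return struct.pack(f"<{len(samples)}h", *samples)
```

The output is still sampled at 8 kHz; resampling to whatever rate the downstream call expects is a separate step.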

Prerequisites

Before starting any phone example:
  • Stream API key and secret
  • A telephony provider account with at least one phone number
  • ngrok (or another HTTPS tunnel) for local development
  • A public webhook URL pointing at your FastAPI server
Prefer managed hosting? Stream Voice AI runs production voice agents on Stream’s global edge, with phone numbers, web and mobile clients, co-located STT/LLM/TTS, and built-in observability. Join the waitlist for early access.

Learning path

Follow this order:
  1. Twilio Phone Agent — get your first inbound and outbound call working
  2. Phone Support Agent (RAG) — add knowledge retrieval on top
  3. Twilio or Telnyx integration pages — API reference when building your own server

Twilio Phone Agent

Step-by-step phone tutorial

Phone Support Agent (RAG)

Add knowledge retrieval to calls

Quick orientation

Expose your local server

Run ngrok http 8000 and note the HTTPS hostname for NGROK_URL.

Point webhooks at your server

Configure your provider to send call events to your FastAPI endpoints (e.g. https://<NGROK_URL>/twilio/voice for Twilio).
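
Because the webhook endpoint is publicly reachable, each request should be authenticated. The Twilio plugin ships verify_twilio_signature for this. The sketch below is not that helper's implementation; it only illustrates Twilio's documented scheme for form-encoded POST webhooks (HMAC-SHA1 over the full URL plus the sorted parameters, base64-encoded, compared with the `X-Twilio-Signature` header). The function names are hypothetical.

```python
import base64
import hashlib
import hmac


def expected_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """Compute the signature Twilio would send for this URL and form body."""
    # Append each parameter name and value, sorted by name, with no separators.
    payload = url + "".join(name + params[name] for name in sorted(params))
    digest = hmac.new(auth_token.encode(), payload.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def is_valid_request(
    auth_token: str, url: str, params: dict[str, str], header_signature: str
) -> bool:
    """Constant-time comparison against the X-Twilio-Signature header value."""
    expected = expected_signature(auth_token, url, params)
    return hmac.compare_digest(expected.encode(), header_signature.encode())
```

Note that `url` must be the public URL Twilio actually called (for example the ngrok HTTPS address), not the local address your server listens on; a mismatch there is a common cause of failed verification.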

Run an example and call your number

Follow the Twilio Phone Agent tutorial for full commands. Dial your Twilio number to verify end-to-end audio.

Next Steps

Twilio Integration

Plugin API reference

Telnyx Integration

Alternative provider

RAG for Agents

Knowledge retrieval backends

Stream Video RTC

Default edge transport

Build a Voice Agent

Voice agent fundamentals