
Voice AI glossary

Plain-English definitions of every term we use. Short enough to send to a colleague, deep enough to actually answer the question.

Voice AI Agent

A voice AI agent is an AI-powered system that has real-time spoken conversations — over a phone call, a web widget or a SIP trunk — using speech recognition, a language model and speech synthesis.

Voice AI

Voice AI is the umbrella term for AI systems that understand and generate human speech in real time — powering voice assistants, phone agents, voice chatbots and real-time translation.

Conversational AI

Conversational AI is the category of AI systems designed to interact with humans in natural language across chat, voice, email and messaging. These systems use natural language understanding (NLU), large language models (LLMs) and tool-calling to hold multi-turn conversations that complete tasks rather than only answer questions.

IVR vs Voice AI

IVR (interactive voice response) is a rigid, scripted decision tree ("press 1 for sales"). Voice AI is a natural-language agent that understands free-form speech, reasons with an LLM, and calls tools to take real actions.
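To make the "rigid decision tree" idea concrete, here is a minimal sketch of an IVR menu as a nested dictionary. The menu contents, route names and the `route_ivr` helper are all hypothetical and exist only for illustration. Note that any input outside the scripted options simply fails, which is the gap a natural-language agent is meant to close.

```python
# Hypothetical IVR menu: each keypad digit leads to another menu or a final route.
IVR_MENU = {
    "prompt": "Press 1 for sales, 2 for support.",
    "options": {
        "1": {"route": "sales"},
        "2": {
            "prompt": "Press 1 for billing, 2 for technical help.",
            "options": {
                "1": {"route": "billing"},
                "2": {"route": "tech_support"},
            },
        },
    },
}


def route_ivr(menu, digits):
    """Walk the menu with the pressed digits; return a route name or None."""
    node = menu
    for digit in digits:
        options = node.get("options", {})
        if digit not in options:
            return None  # off-script input: the tree has nowhere to go
        node = options[digit]
    # Returns None if the caller stopped on a menu that is not a final route.
    return node.get("route")
```

For example, `route_ivr(IVR_MENU, "22")` returns `"tech_support"`, while `route_ivr(IVR_MENU, "9")` returns `None` because 9 is not on the script.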

BYOK (Bring Your Own Key)

BYOK means you bring your own API keys for the LLM, STT and TTS providers, and the voice AI platform routes usage through your accounts instead of bundling the provider costs into its own pricing.

BYON (Bring Your Own Number)

BYON means you bring your own phone number from a Twilio, Vobiz or Exotel account and connect it to the voice AI platform over SIP, instead of renting a number from the platform itself.

SIP Trunking for Voice AI

SIP trunking is a service that carries phone calls over the internet using SIP (Session Initiation Protocol). It lets a voice AI platform send and receive calls by connecting to the public phone network through a carrier like Twilio or Vobiz.

Voice AI Stack (ASR, STT, LLM, TTS)

The voice AI stack has four components: ASR/STT (speech to text), NLU/LLM (language understanding and response generation), TTS (text to speech), and an orchestration layer that connects the three stages in real time.
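As a rough sketch of how the stages fit together, the orchestration layer can be thought of as a function that passes each caller turn through STT, then the LLM, then TTS. Everything below is an assumption made for illustration: `make_agent_turn` and the three `fake_*` stages are hypothetical stand-ins, not any provider's API, and a real orchestrator would stream audio rather than process a whole turn at once.

```python
from typing import Callable


def make_agent_turn(
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> Callable[[bytes], bytes]:
    """Build the orchestration step: caller audio in, agent audio out."""

    def handle_turn(audio_in: bytes) -> bytes:
        transcript = stt(audio_in)   # speech to text
        reply_text = llm(transcript)  # decide what to say
        return tts(reply_text)        # text to speech

    return handle_turn


# Stand-in stages so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


handle_turn = make_agent_turn(fake_stt, fake_llm, fake_tts)
```

Keeping each stage behind a plain function signature is one way to make providers swappable without touching the orchestration code.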

Voice AI Latency

Voice AI latency is the total time between the user finishing a sentence and hearing the agent begin to respond. It is widely treated as the most important quality metric for conversational voice AI, because long pauses make a conversation feel unnatural.
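Because the user hears the sum of every stage's delay, it can help to write latency down as a budget. The sketch below is only illustrative: the stage names and millisecond figures are made-up placeholders, not measurements of any platform, and the two helper functions are hypothetical.

```python
def response_latency_ms(stages):
    """Total delay the caller hears: the sum of each stage's contribution."""
    return sum(stages.values())


def slowest_stage(stages):
    """Name of the stage contributing the most delay."""
    return max(stages, key=stages.get)


# Illustrative placeholder numbers only, in milliseconds.
example_budget_ms = {
    "end_of_turn_detection": 300,
    "stt_finalization": 150,
    "llm_first_token": 400,
    "tts_first_audio": 200,
    "network": 100,
}

total_ms = response_latency_ms(example_budget_ms)  # 1150
bottleneck = slowest_stage(example_budget_ms)      # "llm_first_token"
```

Breaking the total down this way shows which stage to optimize first instead of treating latency as a single opaque number.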

PII (Personally Identifiable Information)

PII is any data that can identify a specific person, either on its own or combined with other information.

PII Redaction

PII redaction automatically detects and masks sensitive personal data in voice AI transcripts and logs before it gets stored.
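A minimal sketch of the idea, assuming a simple pattern-based approach: the two regular expressions and the `redact` helper below are hypothetical and deliberately simplistic. They only catch email addresses and phone-like digit runs, so names, addresses and other PII would pass through untouched; production systems typically combine patterns like these with trained entity-recognition models.

```python
import re

# Simplified, illustrative patterns; real-world formats vary far more than this.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text):
    """Replace each detected span with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


transcript = "Email me at jane@example.com or call +1 415-555-0100."
safe_transcript = redact(transcript)
# safe_transcript == "Email me at [EMAIL] or call [PHONE]."
```

The key design point is that `redact` runs before the transcript is written anywhere, so the raw values never reach storage.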

Turn Detection

Turn detection is how a voice AI agent decides when the caller has finished speaking so it can respond at the right moment.
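One simple way to approximate this is a silence timeout: once the caller has spoken, treat a long enough run of quiet audio frames as the end of the turn. The sketch below assumes that approach. The function name, the energy threshold, the 20 ms frame size and the 500 ms silence window are all illustrative assumptions; production agents often use voice-activity-detection models or look at the meaning of the words as well.

```python
def detect_end_of_turn(frame_energies, energy_threshold=0.01,
                       frame_ms=20, min_silence_ms=500):
    """Return the index of the frame where the turn is judged over, or None.

    frame_energies: per-frame loudness values (for example, RMS energy).
    """
    frames_needed = min_silence_ms // frame_ms
    heard_speech = False
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            heard_speech = True
            silent_run = 0  # speech resumed, so restart the silence count
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return i
    return None  # no speech yet, or the caller is still mid-turn
```

The threshold is a trade-off: a shorter silence window makes the agent respond faster but risks cutting in during a natural pause, while a longer one feels safer but adds directly to response latency.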

No-Code Voice AI Builder

A no-code voice AI builder is a visual flow editor that lets non-engineers design and deploy voice agents — with drag-and-drop blocks for conversation steps, tool calls, branching logic and knowledge-base retrieval.
