AI voice agent cost: what a voice agent really costs per minute

Deepesh Jayal

Teams shopping for a voice agent fixate on the platform's headline rate, then get surprised by the bill, or by how much prices have dropped. The platform fee is one slice of the AI voice agent cost, not the whole thing, and the whole thing is cheaper than it was a year ago. This guide breaks the cost into its five parts, gives realistic 2026 ranges, and shows how to budget for it and cut it down. Prices move fast, so treat the numbers as ranges and confirm them on each vendor's official pricing page.

Evalgent sits on the testing side, so we keep the product light here and focus on the economics. First, the anatomy of the number.

What makes up voice agent cost?

A voice agent makes several service calls for every minute of conversation, and each one costs money. The per-minute total is the sum of these layers.

Cost layer               | Typical 2026 range | What it covers
-------------------------|--------------------|----------------------------------------
Speech-to-text (STT)     | ~$0.01/min         | transcribing caller audio
LLM                      | ~$0.003–$0.06/min  | reasoning and responses
Text-to-speech (TTS)     | ~$0.02–$0.05/min   | generating the agent's voice
Telephony                | ~$0.003–$0.02/min  | carrying the call (or $0 with own SIP)
Platform / orchestration | ~$0.03–$0.05/min   | connecting the pieces

Add them up. A build-your-own stack typically lands around $0.11 to $0.15 per minute with standard models and voices. Newer bundled platforms collapse the layers into one fee as low as $0.03 to $0.05 per minute. The voice agent stack guide explains what each layer does. The LLM and the voice swing the most, so model and TTS choice move the bill most.
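To make the addition concrete, here is a minimal sketch that sums the low and high ends of the five layers. The figures are the ranges from the table above; the names `LAYERS` and `per_minute_range` are illustrative, not part of any vendor's API.

```python
# Per-minute ranges in USD, copied from the cost-layer table: (low, high).
LAYERS = {
    "stt": (0.01, 0.01),
    "llm": (0.003, 0.06),
    "tts": (0.02, 0.05),
    "telephony": (0.003, 0.02),
    "platform": (0.03, 0.05),
}


def per_minute_range(layers):
    """Return the (low, high) per-minute total across all layers."""
    low = sum(lo for lo, _ in layers.values())
    high = sum(hi for _, hi in layers.values())
    # Round to tenths of a cent to hide floating-point noise.
    return round(low, 3), round(high, 3)


low, high = per_minute_range(LAYERS)
print(f"${low:.3f} to ${high:.3f} per minute")
```

With these ranges the envelope runs from about $0.066 to $0.19 per minute, and the typical $0.11 to $0.15 figure sits inside it. Replacing a range with the actual price you are quoted narrows the estimate.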

What is the cost per minute of a voice agent?

The voice agent cost per minute depends on how you build. Bundled platforms charge one all-in rate. Build-your-own platforms charge a base fee plus the providers you pick. Self-hosting removes the platform fee but adds infrastructure and engineering.

Overall, the cost of voice AI has dropped as competition grew. The 2026 map has three tiers. Bundled all-in offerings are cheapest: Plivo prices its AI agent at $0.03 per minute plus telephony, per the Plivo pricing page, and xAI's Grok Voice bundles the model and voice at $0.05 per minute, per x.ai/voice. Build-your-own platforms like Vapi land around $0.11 to $0.15 all-in with a standard stack. Self-hosting on a framework can go under $0.05 per minute, plus your own compute.

The direction is clear. New entrants keep pushing the floor down. A build-your-own stack that cost $0.25 per minute a year ago is closer to $0.12 today. Bundled players like Plivo and xAI now undercut that further. Expect the numbers to keep falling as models get cheaper and competition grows, so re-check rates often.

How much does Vapi or Retell cost per minute?

These build-your-own platforms are a common shortlist, and their pricing shows the pattern. Both charge a base platform fee, then pass through the models and telephony you use.

Vapi's base orchestration fee is around $0.05 per minute, the lowest of the majors. A standard stack lands about $0.11 to $0.15 per minute all-in, per the Vapi pricing page. Only a premium configuration, a top model plus a multilingual voice, pushes toward $0.30 or more. Retell starts near $0.07 to $0.08 per minute for its voice engine, with a similar all-in range, per the Retell pricing page. ElevenLabs prices its agent minutes around $0.08, with the LLM passed through separately. For the full trade-off, see our Vapi vs Retell comparison.

Is it cheaper to self-host a voice agent?

At high volume, yes, but not for free. Self-hosting on a framework removes the platform fee. Your per-minute cost drops to the providers plus your own compute. The catch is that you now own latency tuning, telephony, uptime, and on-call.

The economics flip at scale. Below roughly ten thousand minutes a month, a bundled or managed platform is usually cheaper once you count engineering time. Above roughly fifty thousand minutes a month, self-hosting can cut per-minute cost sharply, which our LiveKit vs Vapi guide explores. The right answer is a total-cost-of-ownership calculation, not just the per-minute rate. Include the people who run the stack. Include on-call time. Include the risk of downtime. A low per-minute rate that needs a full-time engineer is not cheap. Model the whole cost, then decide.
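The crossover can be sketched as a break-even calculation. The example below assumes a managed rate of $0.12 per minute (inside the range quoted earlier), a self-hosted rate of $0.05, and a purely hypothetical $4,000 a month for the engineering and infrastructure that self-hosting adds. Replace all three with your own figures.

```python
def monthly_cost(minutes, rate_per_min, fixed_monthly=0.0):
    """Total monthly spend: usage charges plus any fixed overhead."""
    return minutes * rate_per_min + fixed_monthly


def break_even_minutes(managed_rate, self_hosted_rate, self_hosted_fixed):
    """Monthly minutes at which self-hosting starts to cost less.

    Returns None when self-hosting has no per-minute saving with which
    to recover its fixed overhead.
    """
    saving_per_min = managed_rate - self_hosted_rate
    if saving_per_min <= 0:
        return None
    return self_hosted_fixed / saving_per_min


MANAGED_RATE = 0.12        # assumption: within the $0.11 to $0.15 range
SELF_HOSTED_RATE = 0.05    # assumption: "under $0.05" taken as a ceiling
ENGINEERING_FIXED = 4000   # hypothetical monthly people-and-infra cost

crossover = break_even_minutes(MANAGED_RATE, SELF_HOSTED_RATE, ENGINEERING_FIXED)
for minutes in (10_000, 50_000, 100_000):
    managed = monthly_cost(minutes, MANAGED_RATE)
    hosted = monthly_cost(minutes, SELF_HOSTED_RATE, ENGINEERING_FIXED)
    print(f"{minutes:>7} min  managed ${managed:,.0f}  self-hosted ${hosted:,.0f}")
print(f"break-even near {crossover:,.0f} minutes a month")
```

Under these assumptions the break-even falls at around 57,000 minutes a month. The result is very sensitive to the fixed-cost figure, which is why the people who run the stack belong in the model.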

What drives voice agent cost up or down?

A few levers move the number more than others. Knowing them lets you budget and optimise deliberately.

  • LLM choice: a wide swing; a cheaper or smaller model can halve the LLM line.
  • Reasoning mode: turning on model reasoning adds tokens and cost per turn.
  • Voice quality: premium and multilingual TTS voices cost more than standard ones.
  • Bundled vs build-your-own: a bundled all-in platform can be cheaper than assembling parts.
  • Telephony routing: bringing your own SIP can drop telephony toward zero.
  • Concurrency and scale: volume discounts and concurrency pricing matter at scale.
  • Call length: shorter, well-designed calls cost less; latency-driven repeats cost more.

The best LLM for voice agents guide covers the model side, which is usually the biggest cost lever, alongside the voice.
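As a rough illustration of how much the levers move the total, the sketch below starts from a stack priced at the top of each layer's range and then applies three changes: a cheaper LLM, a standard voice, and own-SIP telephony. The layer prices are the range endpoints given earlier in this guide; `apply_levers` and `total` are hypothetical helper names.

```python
# Assumed starting point: every layer at the top of its quoted range (USD/min).
BASELINE = {
    "stt": 0.01,
    "llm": 0.06,
    "tts": 0.05,
    "telephony": 0.02,
    "platform": 0.05,
}


def apply_levers(stack, **overrides):
    """Return a copy of the stack with some layer prices replaced."""
    unknown = set(overrides) - set(stack)
    if unknown:
        raise KeyError(f"unknown layers: {sorted(unknown)}")
    return {**stack, **overrides}


def total(stack):
    """Per-minute total, rounded to tenths of a cent."""
    return round(sum(stack.values()), 3)


# Cheaper LLM, standard voice, and own SIP for telephony.
optimised = apply_levers(BASELINE, llm=0.003, tts=0.02, telephony=0.0)

print(f"baseline:  ${total(BASELINE):.3f}/min")
print(f"optimised: ${total(optimised):.3f}/min")
```

Here the three changes take the stack from $0.19 to about $0.083 per minute, with the LLM swap contributing the largest share of the drop.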

Reading voice agent pricing correctly

Published AI voice agent pricing can mislead. Some vendors quote a base fee that is the smallest part. Others quote a bundled all-in rate that already includes everything. Read which one you are looking at before you compare.

So when someone asks how much a voice agent costs, the honest answer is that it depends on your stack and your platform model. The cost of voice agents is not one price. It is your models, your voices, your telephony, and your volume, added up, or one bundled fee that hides those parts.

Two calls can differ widely in cost. A short call on a cheap model with a standard voice is inexpensive. A long call on a premium multilingual voice costs several times more. The audio path matters too. A poor codec can force retries, and every retry adds paid minutes. Read every quote as a floor or a bundle, then build your own estimate from the layers.

How do you budget for a voice agent?

Budgeting starts with your own numbers, not a vendor's headline rate. Estimate average call length, monthly call volume, and the models and voices you will actually use. Then build the per-minute total from the layers above, or take a bundled rate if you use an all-in platform.

Work from a realistic per-minute figure. Multiply by expected minutes. Add a margin for calls that run long or repeat. Then model two scenarios: bundled or managed now, self-hosted later, so you know where the crossover sits for your volume. A voice agent cost breakdown that is a year old is usually wrong. Prices fall as models get cheaper and competition grows. Re-run the estimate each quarter. It takes an hour and prevents a nasty surprise on the invoice.
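The budgeting steps reduce to a short calculation. In this sketch the call volume, call length, and 15% margin are made-up inputs for illustration, and the $0.12 rate is taken from the build-your-own range quoted earlier.

```python
def monthly_budget(calls_per_month, avg_call_minutes, rate_per_min, margin=0.15):
    """Estimate monthly spend with headroom for long or repeated calls."""
    if margin < 0:
        raise ValueError("margin must be zero or positive")
    minutes = calls_per_month * avg_call_minutes
    base = minutes * rate_per_min
    return base * (1 + margin)


# Hypothetical inputs: 5,000 calls a month, 3 minutes each, $0.12/min.
estimate = monthly_budget(
    calls_per_month=5_000,
    avg_call_minutes=3,
    rate_per_min=0.12,
    margin=0.15,
)
print(f"budget about ${estimate:,.0f} a month")
```

These inputs give 15,000 minutes and a budget of about $2,070 a month. Re-running the function with a bundled rate or a self-hosted rate gives the second scenario for comparison.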

The cost most teams forget: failed calls

Here is the line item that never appears on a pricing page. A failed call still costs you the per-minute fee, and then costs you again in a churned caller, a support escalation, or a lost sale. The cheapest per-minute agent is expensive if it fails often.

This is where testing pays for itself. Catching failures before production means fewer wasted minutes and fewer costly escalations, which is the real return on testing, covered in our piece on the ROI of voice agent testing. Evalgent runs realistic calls before launch so the minutes you pay for are minutes that actually complete. Cost per minute matters, but cost per successful call matters far more in practice. A completed call earns its fee back. A failed one pays twice: once for the minute, once for the caller you lost.
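One way to put a number on this is cost per successful call. The sketch below is an assumption about how to model it, not a formula from any pricing page: it spreads the per-call fee and an expected failure penalty over the calls that complete. The success rates and the $2 penalty per failed call are hypothetical.

```python
def cost_per_successful_call(rate_per_min, avg_call_minutes, success_rate,
                             failure_penalty=0.0):
    """Expected spend per completed call.

    failure_penalty is the assumed downstream cost of one failed call
    (escalation, churn, lost sale), in the same currency as the rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    fee_per_call = rate_per_min * avg_call_minutes
    expected_failure_cost = (1 - success_rate) * failure_penalty
    return (fee_per_call + expected_failure_cost) / success_rate


# Hypothetical comparison: a cheap agent that fails often versus a
# pricier agent that completes most calls.
cheap = cost_per_successful_call(0.05, 3, success_rate=0.70, failure_penalty=2.0)
sturdy = cost_per_successful_call(0.12, 3, success_rate=0.95, failure_penalty=2.0)

print(f"cheap agent:  ${cheap:.2f} per successful call")
print(f"sturdy agent: ${sturdy:.2f} per successful call")
```

With these made-up inputs the cheaper agent comes out at roughly $1.07 per successful call against about $0.48 for the pricier one, so the lower per-minute rate ends up costing more per completed call.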

Frequently asked questions

How much does an AI voice agent cost?

In 2026, an AI voice agent costs as little as $0.03 to $0.05 per minute on bundled all-in platforms, around $0.11 to $0.15 on a typical build-your-own stack, and under $0.05 self-hosted at scale. Only premium configurations with top models and multilingual voices reach $0.30 or more. The cost is made of speech-to-text, the LLM, text-to-speech, telephony, and a platform fee.

What is the cost per minute of a voice agent?

The cost per minute of a voice agent is the sum of five layers: speech-to-text at around $0.01, the LLM at $0.003 to $0.06, text-to-speech at $0.02 to $0.05, telephony at $0.003 to $0.02, and a platform fee of $0.03 to $0.05. A build-your-own stack lands near $0.11 to $0.15, while bundled all-in platforms start at $0.03 to $0.05 per minute.

What makes up AI voice agent cost?

AI voice agent cost is made of speech-to-text, the LLM, text-to-speech, telephony, and a platform or orchestration fee. Each runs for every minute of conversation, so the per-minute total is their sum. The LLM and the voice vary most, so model and TTS choice move the bill most. Bundled platforms collapse these layers into one all-in fee instead of separate line items.

How do you reduce voice agent cost?

Reduce voice agent cost by choosing a cheaper or smaller LLM where quality allows, keeping model reasoning off for the live turn, using standard rather than premium voices, and routing telephony through your own SIP. A bundled all-in platform can undercut a build-your-own stack. At high volume, self-hosting cuts the platform fee. Shorter, well-designed calls also lower cost, since latency-driven repeats waste paid minutes.

Is it cheaper to self-host a voice agent?

Self-hosting is cheaper per minute at high volume, but not for free. It removes the platform fee, leaving provider and compute costs, but you take on latency tuning, telephony, uptime, and on-call. Below roughly ten thousand minutes a month a bundled or managed platform usually wins once engineering time is counted; above roughly fifty thousand minutes a month, self-hosting can cut per-minute cost sharply.

How much does Vapi or Retell cost per minute?

Vapi's base orchestration fee is around $0.05 per minute, with a standard stack landing about $0.11 to $0.15 all-in; only a premium configuration, a top model plus a multilingual voice, pushes it toward $0.30 or more. Retell starts near $0.07 to $0.08 per minute for its voice engine, with a similar all-in range. Confirm current rates on each platform's pricing page, since they change often.

How do you budget for a voice agent?

Budget for a voice agent by estimating average call length, monthly call volume, and the models and voices you will use, then building the per-minute total from the cost layers, or taking a bundled rate. Multiply by expected minutes and add margin for long or repeated calls. Model bundled versus self-hosted so you know the crossover, and revisit when prices or volume change.

Does testing add to voice agent cost?

Testing adds a small upfront cost and removes a larger ongoing one. Failed calls in production still cost the per-minute fee, plus churned callers and escalations, so catching failures before launch saves money overall. Pre-release testing turns wasted, failing minutes into completing ones, which is why cost per successful call matters more than the raw per-minute rate.

Conclusion

AI voice agent cost is a five-layer per-minute figure, not a single platform fee, and in 2026 it is lower than most teams expect. Bundled platforms start at $0.03 to $0.05 per minute. A typical build-your-own stack lands near $0.11 to $0.15. Only premium configurations, with top models and multilingual voices, reach $0.30 or more. The LLM and the voice are the biggest levers.

Budget from your own volume and model choices, not a headline rate. And remember the hidden cost: a failed call is paid for twice, so the cheapest agent is the one that actually completes the calls you pay for.