Voice AI Is Having Its Moment: $586M in Funding and a Massive IBM Partnership

In the past four months, voice AI companies have raised over half a billion dollars. ElevenLabs closed a $500 million Series D at an $11 billion valuation. PolyAI pulled in $86 million to scale its enterprise voice agents. IBM partnered with ElevenLabs to bring natural-sounding voice to its watsonx enterprise platform.
That is not a trend. That is a stampede.
And it makes sense. Text-based chatbots have proven their value for customer support, lead qualification, and sales. But there is a massive chunk of customer interactions that still happen over the phone, and until recently, the AI options for handling those calls ranged from "terrible IVR menu" to "slightly less terrible IVR menu."
That is changing fast.
The Money Trail
Let us start with the numbers, because they tell a story.
ElevenLabs raised $500 million in February 2026, led by Sequoia Capital. Their valuation jumped from $3.3 billion to $11 billion in just twelve months, more than tripling. They closed 2025 at $330 million in annual recurring revenue. The company's voice technology supports over 10,000 voices across 70 languages, and enterprise clients include Deutsche Telekom and Deliveroo.
PolyAI raised $86 million in December 2025 in a round that included NVIDIA's venture arm, Zendesk Ventures, and Citi Ventures. They have over 2,000 live deployments across 45 languages in 25 countries, with enterprise customers like Marriott, Caesars Entertainment, and Foot Locker. A Forrester study found PolyAI customers achieve a 391% return on investment.
These are not speculative bets on early-stage startups. These are growth-stage rounds backing companies with real revenue, real enterprise customers, and real measurable ROI.
Why Voice Matters Now
Text chatbots handle a lot. But there are scenarios where voice is simply better.
A customer calling their bank about a suspicious transaction does not want to type out the details. A hotel guest calling the front desk at midnight about a broken AC wants to talk, not text. A patient calling a clinic to reschedule an appointment wants the interaction to feel human, not like filling out a web form.
The breakthrough in voice AI is that these conversations can now feel genuinely natural. We are past the robotic "I did not understand that, please say it again" era. Modern voice AI from companies like ElevenLabs and PolyAI handles interruptions, understands context, picks up on emotional cues, and responds with appropriate tone and pacing across dozens of languages.
The IBM-ElevenLabs partnership, announced in March 2026, underscores how seriously enterprises are taking this. IBM is integrating ElevenLabs' text-to-speech and speech-to-text into watsonx Orchestrate, its agentic AI platform. The integration includes enterprise-grade protections: PCI compliance for payment processing, zero-retention mode for HIPAA-compliant healthcare data, and data residency controls. Banks, insurance companies, healthcare providers, and utilities are the primary targets.
This is not consumer novelty. This is enterprise infrastructure.
What This Means for Customer-Facing Businesses
If you are running a business that handles customer calls, here is why you should be paying attention:
Phone support is not going away. Despite the growth of chat and messaging, a significant portion of customer interactions still happen by phone, especially in industries like healthcare, financial services, and hospitality. Voice AI lets you handle those calls at scale without sacrificing quality.
The economics are compelling. According to Forrester, PolyAI customers are seeing an average of $10.3 million in savings. When a voice agent can handle 60-70% of inbound calls, from appointment scheduling to order status to basic troubleshooting, the math gets attractive quickly.
Multilingual support becomes trivial. Hiring bilingual or multilingual phone agents is expensive and difficult. Voice AI that handles 70 languages natively eliminates that constraint overnight. For businesses with international customers, this is transformative. We have seen similar benefits play out in the hotel industry, where multilingual chatbots are becoming a genuine competitive edge.
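To make the call-automation math concrete, here is a back-of-the-envelope sketch in Python. Every input is a hypothetical placeholder: the call volume, the per-call costs, and the function name are our assumptions, not figures from PolyAI or Forrester. Only the 60-70% automation range comes from the discussion above. Swap in your own numbers.

```python
def monthly_savings_cents(calls_per_month, automation_pct, human_cost_cents, ai_cost_cents):
    """Estimate monthly savings (in cents) when a share of inbound calls
    moves from human agents to a voice agent.

    All amounts are whole cents so the arithmetic stays exact.
    """
    if not 0 <= automation_pct <= 100:
        raise ValueError("automation_pct must be between 0 and 100")
    # Number of calls the voice agent handles end to end.
    automated_calls = calls_per_month * automation_pct // 100
    # Each automated call saves the difference between the two per-call costs.
    return automated_calls * (human_cost_cents - ai_cost_cents)


# Hypothetical inputs: 10,000 calls a month, $6.00 per human-handled call,
# $1.00 per AI-handled call. These are illustrative, not sourced.
low = monthly_savings_cents(10_000, 60, 600, 100)
high = monthly_savings_cents(10_000, 70, 600, 100)

print(f"At 60% automation: ${low / 100:,.2f} per month")
print(f"At 70% automation: ${high / 100:,.2f} per month")
```

With these made-up inputs, the gap between 60% and 70% automation is $5,000 a month. The point is the shape of the calculation, not the specific figures.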
Text and Voice Are Converging
Here is where it gets really interesting: the distinction between text chatbots and voice agents is blurring. The best customer experience platforms are heading toward omnichannel AI that can handle a conversation whether it starts as a website chat, moves to a phone call, or continues via WhatsApp.
A customer might start chatting on your website, get a callback from an AI voice agent to walk through a complex issue, and then receive a follow-up text summary. Same AI, same context, different channels.
This is already happening. PolyAI calls its platform "voice-first, omnichannel." IBM's watsonx Orchestrate handles multiple interaction types. The thread connecting all of it is that the underlying AI can understand intent, maintain context, and respond naturally, regardless of whether the input is text or speech.
For businesses already using text-based AI support, voice is the natural next step. If you are not there yet and want to start with the foundation, our guide on how to add an AI chatbot to your website is a good starting point. Get text right first, then layer on voice when you are ready.
What to Watch
The voice AI space is moving fast enough that the landscape could look very different by the end of 2026. A few things to track:
Consolidation is coming. With this much funding flowing in, expect acquisitions. Customer support platforms that do not have a voice strategy will need to buy one.
Quality will differentiate. As more companies deploy voice AI, customers will quickly learn the difference between a good voice agent and a bad one. The same frustration dynamic that plagues bad text-based chatbot implementations will play out in voice, just louder (literally).
And the companies that integrate voice and text into a seamless experience will win over those that treat them as separate channels.
The era of "press 1 for billing, press 2 for support" is ending. What replaces it is going to be a lot more interesting. If you are building out your customer support stack and want to start with AI that actually works, give Converzoy a try.
