voice AI Archives - Tech | Business | Economy

OpenAI Launches New Real-Time Voice Models for Translation, Live Conversations

Joan Aimuengheuwa — Fri, 08 May 2026 08:30:41 +0000

OpenAI has launched three new voice models for developers, expanding its real-time audio tools that can speak, translate and transcribe conversations as they happen.

The company said the new tools are designed to make voice-based apps more useful in everyday situations, especially where users need software to respond naturally while carrying out tasks in real time.

At the centre of the launch is GPT-Realtime-2, a voice model OpenAI says can handle more difficult requests while keeping conversations flowing naturally.

The company said that, unlike earlier versions, the model uses GPT-5-level reasoning to manage interruptions, understand context better and carry out actions during conversations.

OpenAI also unveiled GPT-Realtime-Translate, a live translation model that can translate speech from more than 70 input languages into 13 output languages.

According to the company, the system keeps pace with the speaker during conversations instead of translating after pauses or completed sentences.

The third model, GPT-Realtime-Whisper, focuses on live speech transcription. It converts spoken words into text instantly while a person is talking.

“Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” OpenAI said.

Voice products have become a huge focus for technology companies as more users interact with software through speech instead of typing. OpenAI said developers want systems that can manage tasks while conversations continue naturally.

The company pointed to customer support, travel, education, media and creator platforms as some of the areas expected to benefit from the new models.

OpenAI also described three growing patterns it sees in voice-based software.

The first is “voice-to-action”, where users speak naturally and the system completes tasks on their behalf. OpenAI said property platform Zillow is building an assistant that can help users search for homes, avoid certain neighbourhood conditions and book tours through voice requests.

Another pattern is “systems-to-voice”, where software provides spoken updates automatically. OpenAI gave the example of travel apps that could alert passengers about delayed flights, new boarding gates or transfer routes without users typing commands.

The third area is “voice-to-voice”, which focuses on live multilingual conversations. OpenAI said Deutsche Telekom is developing customer support systems that translate discussions instantly while both sides continue speaking in their preferred languages.

Travel company Priceline is also working on voice-based trip management tools, according to OpenAI. Travellers could eventually book flights, change hotel reservations and receive airport updates entirely through conversation.

Alongside the broader rollout, OpenAI added several new features to GPT-Realtime-2 aimed at improving live interactions.

Developers can now enable short phrases such as “let me check that” or “one moment while I look into it” before the system completes a request. OpenAI said this gives users clearer feedback while the model processes tasks in the background.

The model can also call multiple tools at once and explain those actions aloud during conversations. OpenAI said the system may say things like “checking your calendar” or “looking that up now” while working through requests.
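The pattern described here, speaking a short status phrase and then running several tools at the same time, can be illustrated with a minimal sketch. It does not use OpenAI's Realtime API; the tool functions `check_calendar` and `look_up`, and the `say` callback, are hypothetical stand-ins invented for illustration.

```python
import asyncio


# Hypothetical tools: stand-ins for real calendar and search back ends.
async def check_calendar(day: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"No meetings on {day}"


async def look_up(query: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"Top result for {query!r}"


async def handle_request(say) -> list[str]:
    # Speak a short status phrase first so the user is not left in silence.
    say("Checking your calendar and looking that up now.")
    # Run both tools concurrently; gather returns results in call order.
    results = await asyncio.gather(
        check_calendar("Friday"),
        look_up("flights to Lagos"),
    )
    return list(results)


def run_demo() -> tuple[list[str], list[str]]:
    spoken: list[str] = []  # collects what the assistant would say aloud
    results = asyncio.run(handle_request(spoken.append))
    return spoken, results


if __name__ == "__main__":
    print(run_demo())
```

Because the two tools run concurrently, the total wait is roughly that of the slower tool rather than the sum of both.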

The company added that GPT-Realtime-2 recovers better from errors or failed requests instead of stopping conversations abruptly. It also supports a larger context window, increasing from 32K to 128K, allowing longer and more detailed conversations.

OpenAI further noted that the model has improved understanding of specialised terms, including healthcare vocabulary and proper nouns. Developers can also adjust how much reasoning power the model uses depending on the complexity of a request.

According to benchmark figures released by the company, GPT-Realtime-2 achieved higher scores than GPT-Realtime-1.5 on audio intelligence and instruction-following tests.

All three models are available through OpenAI’s Realtime API. The company said GPT-Realtime-Translate and GPT-Realtime-Whisper will be billed by the minute, while GPT-Realtime-2 pricing depends on token usage.
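To show how the two billing approaches differ in practice, here is a rough sketch of a cost estimator. The rates below are placeholders chosen for illustration, not OpenAI's published prices, and the function names are hypothetical.

```python
# Placeholder rates for illustration only, NOT OpenAI's published prices.
PER_MINUTE_RATE_USD = 0.05
PER_MILLION_TOKENS_USD = 10.0


def per_minute_cost(minutes: float, rate: float = PER_MINUTE_RATE_USD) -> float:
    """Cost of a model billed by audio duration."""
    return minutes * rate


def per_token_cost(tokens: int, rate_per_million: float = PER_MILLION_TOKENS_USD) -> float:
    """Cost of a model billed by token usage."""
    return tokens / 1_000_000 * rate_per_million


if __name__ == "__main__":
    print(f"10 minutes of audio: ${per_minute_cost(10):.2f}")
    print(f"250,000 tokens: ${per_token_cost(250_000):.2f}")
```

Per-minute billing makes the cost of a call predictable from its length alone, while token-based billing varies with how much the model hears, reasons and says.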

OpenAI said it has added safeguards to reduce misuse, including protections against spam, fraud and harmful content. The company also added that conversations can be stopped automatically if they break its safety rules.

The post OpenAI Launches New Real-Time Voice Models for Translation, Live Conversations appeared first on Tech | Business | Economy.

Deepgram Selects Penguin to Optimize AI Inference Infrastructure for Enterprise Voice AI

Staff Writer — Fri, 20 Mar 2026 10:30:55 +0000

Penguin Solutions, the AI factory platform company, recently announced a strategic collaboration with Deepgram and Dell Technologies to architect and deploy a fully optimized, production-ready infrastructure aligned to Deepgram’s demanding enterprise voice AI requirements.

By leveraging its unique expertise in designing, building, deploying, and managing AI infrastructure with Dell PowerEdge servers and Dell PowerScale storage optimized for AI workloads, Penguin Solutions delivered an optimal solution to support and enhance Deepgram’s innovative Speech-to-Text (STT), Text-to-Speech (TTS), and Voice Agent capabilities, while ensuring maximum reliability and performance.

As enterprise adoption of generative AI accelerates, organizations must adhere to stricter service level agreements (SLAs), which require infrastructure that can ensure low latency and high concurrent usage.

This Penguin-led deployment addresses these challenges by combining Deepgram’s innovative voice AI models with a purpose-built architectural design, a highly efficient deployment, and ongoing performance optimization.

“Modern AI workloads demand infrastructure that performs consistently and scales predictably under heavy loads, particularly for real-time inference applications like voice agents,” said Joe Castillo, vice president of sales at Penguin Solutions. “By partnering with Deepgram and utilizing proven Dell AI infrastructure, Penguin Solutions is delivering a validated, scalable, end-to-end architecture. Our comprehensive framework equips Deepgram with the optimized infrastructure needed to reliably and accurately deliver complex voice AI capabilities in healthcare, retail, and other industries.”

Drawing on its extensive experience with HPC and AI infrastructure, Penguin Solutions ensures that the underlying infrastructure meets the specific demands of Deepgram’s neural networks.

The architecture also incorporates Dell PowerScale storage and Dell PowerEdge XE7745 servers with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, which provide efficient inferencing that enables data-intensive voice applications to operate seamlessly in real-time environments.

“Deepgram is focused on delivering voice AI capabilities that meet the demanding performance, scalability, and reliability requirements of enterprise environments – something only Deepgram brings to the market today,” said Abe Pursell, vice president of partnerships and business development at Deepgram. “The infrastructure behind our platform has to be equally robust to support that level of innovation. Penguin Solutions demonstrated a deep understanding of our technical requirements, translating them into a sophisticated infrastructure environment that meets and exceeds expectations. This enables us to continue delivering the enterprise-class capabilities our customers rely on.”

“AI-driven voice applications are transforming how organizations engage with customers and patients, but success depends on a resilient, high-performance infrastructure foundation,” said David Noy, vice president, unstructured data solutions product management at Dell Technologies. “Our collaboration with Penguin Solutions demonstrates how AI-optimized Dell PowerScale storage and Dell PowerEdge servers with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs can accelerate enterprise AI adoption at scale. Together, we’re enabling Deepgram to deliver secure, low-latency voice AI experiences that power mission-critical innovation across healthcare and retail.”

The Deepgram-Penguin Solutions-Dell collaboration offers a comprehensive approach for enterprises looking to modernize their customer and employee experiences.

With Deepgram’s API-driven voice capabilities, Penguin Solutions’ AI services, and Dell’s powerful AI infrastructure, organizations can achieve highly accurate, real-time transcription and speech synthesis, all while maintaining strict data governance and control.

Deepgram Earns AWS Generative AI Competency to Strengthen Voice AI Solutions

Joan Aimuengheuwa — Fri, 03 Oct 2025 06:51:22 +0000

Deepgram has earned the Amazon Web Services (AWS) Generative AI Competency, a recognition that strengthens the company’s place among trusted partners helping organisations deploy advanced artificial intelligence solutions at scale.

The designation comes after a demanding evaluation that required Deepgram to demonstrate technical strength, verified customer success, and real-world deployments. The recognition signals that its voice technologies are powerful, secure and production-ready.

Abe Pursell, vice president of Business Development and Partnerships at Deepgram, explained the importance of the achievement. “Generative AI is one of the most transformative technologies of our time — but in order for enterprises to adopt it with confidence, they need proof it works at scale and integrates seamlessly into their existing stack. This recognition from AWS gives our customers exactly that peace of mind. It shows Deepgram’s voice AI solutions have already been tested, vetted, and proven in the real world.”

For customers, the benefit isn’t limited to trust. The partnership brings closer alignment with AWS services such as Amazon Bedrock, Amazon Connect, and Amazon SageMaker. It also enables enterprises to take advantage of AWS Marketplace access, Private Pricing Agreements (PPAs), and AWS credits, factors that can reduce costs and speed up deployment.

According to Pursell, “For customers, collaborating with an AWS Generative AI Competency provider like Deepgram translates into faster time-to-value, reduced total cost of ownership (TCO), and peace of mind that their investment is future-proofed within the AWS GenAI ecosystem.”

The AWS Competency Programme is designed to help organisations identify partners with proven expertise in using AWS tools and infrastructure to build and integrate generative AI solutions.

For Deepgram, it represents an endorsement of years spent refining voice-native models capable of handling speech-to-text, text-to-speech, and speech-to-speech tasks with speed and accuracy.

With more than 200,000 developers building on its platform and over a trillion words already transcribed, the company has built solutions that are central to the voice AI market. Serving customers from startups to global enterprises, Deepgram now stands on even stronger ground within AWS’s growing generative AI space.

Caantin Sets Sights on Global Voice AI Market with $4 Million Raise

Joan Aimuengheuwa — Thu, 26 Jun 2025 17:12:07 +0000

Zambian startup Caantin is raising $4 million to build out its AI-powered voice infrastructure and enter new markets beyond Africa.

The company, which transitioned from a general AI solutions provider to a voice-first call centre automation platform just six months ago, is now positioning itself as a key backend partner for banks, fintechs, insurers, and ISPs across the continent.

Before voice AI, there was data analytics, and before that, hospitality. Each pivot was a response to market challenges, but the voice AI play is different: revenue is climbing and the business case is obvious.

With nearly $1 million in monthly revenue and projections of $10 million in ARR by the end of 2025, Caantin is betting that banks and financial service firms can’t afford to ignore voice-based automation.

The startup’s infrastructure now supports over one million calls per day. Clients include names like Carbon and Fairmoney. In fact, Carbon’s CEO, Chijioke Dozie, is also an investor in Caantin. And it’s not just about performance metrics; it’s about cost-cutting at scale.

Customer service, especially for loan recovery, is one of the most expensive and high-stakes operations in banking. “If these banks stop calling borrowers, they lose money,” said Njawa Mutambo, Caantin’s CEO. “But managing that operation is expensive and fragile. AI is not a nice-to-have. It’s essential for scale.”

Caantin’s pricing is aggressive. In Nigeria, it charges ₦185 per minute (about 12 cents), which is nine times higher than local telecom operator rates. Yet for financial institutions bleeding cash on sprawling customer service teams, the ROI more than justifies the premium.

One of its clients, Nigerian fintech Cowrywise, reportedly managed 100,000 customer calls in just three months with only one human staff member, something that would normally require around 30 agents.

Caantin estimates a 933% return on investment and a 1.3-month payback period for businesses switching from human agents to voice AI.
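For readers unfamiliar with the two metrics, the sketch below shows the standard definitions of return on investment and payback period. The article does not disclose the inputs behind Caantin's estimates, so the figures used here are hypothetical values picked only to reproduce the headline numbers; the two metrics may also rest on different inputs.

```python
def roi_percent(total_benefit: float, total_cost: float) -> float:
    """Return on investment: net gain as a percentage of cost."""
    return (total_benefit - total_cost) / total_cost * 100


def payback_months(upfront_cost: float, monthly_net_saving: float) -> float:
    """Months needed for savings to cover the upfront cost."""
    return upfront_cost / monthly_net_saving


if __name__ == "__main__":
    # Hypothetical inputs, not Caantin's data.
    print(f"ROI: {roi_percent(total_benefit=10_330, total_cost=1_000):.0f}%")
    print(f"Payback: {payback_months(upfront_cost=1_300, monthly_net_saving=1_000):.1f} months")
```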

Mutambo had previously raised $2.16 million for TopUp Mama, a procurement platform for restaurants across Kenya and Nigeria. That experience, rooted in B2B logistics, local operations, and scaling infrastructure, appears to be shaping how Caantin is built.

Now, the company is planning to go beyond Africa. Its next major push is Latin America, where the cost of labour is higher, but the challenges of customer engagement are nearly identical. “In Brazil, the cost of call centre labour is around $2 per hour. In Nigeria, it’s closer to 25 cents,” Mutambo said. “So the ROI for AI is even stronger in LATAM.”

Brazil’s $262 minimum wage, compared with Nigeria’s $46, underscores the economic gap Caantin is looking to exploit. And in markets where businesses are desperate to improve margins, automation is now a means of staying sustainable.
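The size of the gap can be worked out directly from the figures quoted above. This is a simple arithmetic sketch using only the numbers in the article; the variable names are illustrative.

```python
# Figures quoted in the article (US dollars).
BRAZIL_CALL_CENTRE_HOURLY = 2.00
NIGERIA_CALL_CENTRE_HOURLY = 0.25
BRAZIL_MINIMUM_WAGE = 262
NIGERIA_MINIMUM_WAGE = 46

# How many times more expensive Brazil is on each measure.
labour_ratio = BRAZIL_CALL_CENTRE_HOURLY / NIGERIA_CALL_CENTRE_HOURLY
wage_ratio = BRAZIL_MINIMUM_WAGE / NIGERIA_MINIMUM_WAGE

if __name__ == "__main__":
    print(f"Call centre labour: {labour_ratio:.0f}x more expensive in Brazil")
    print(f"Minimum wage: {wage_ratio:.1f}x higher in Brazil")
```

On these numbers, call centre labour costs eight times more in Brazil, and the minimum wage is roughly 5.7 times higher.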

Other companies like YC-backed Bland AI and Observe.AI are also building voice-first platforms. But few are adapting their systems to support African languages or integrate directly with regional fintech infrastructure like Paystack or Flutterwave. That’s where Caantin sees its edge.

“We are a telecoms business tailored to financial services. Their growth becomes our growth,” Mutambo explained. “By serving banks and fintechs, we are effectively hedged within a high-yield vertical.”

The company is developing advanced analytics to pull insights from voice data, turning it into a decision-making asset for clients. This positions Caantin as more than a call automation tool; it’s aiming to become a strategic enterprise layer across industries.

On a continent that produces under 1% of global AI research, but where infrastructure inefficiencies are common, Caantin’s push for voice-first AI is timely and necessary. High illiteracy, low smartphone usage, and poor connectivity make voice far more inclusive than apps or chatbots.

The global call centre AI market is projected to grow from $1.6 billion in 2022 to over $7.5 billion by 2030. Caantin is angling for a piece of that, starting from Africa but with its eyes clearly fixed on bigger, more lucrative terrain.
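The projection implies a compound annual growth rate that can be derived from the two endpoints. The sketch below applies the standard CAGR formula to the figures in the article; since the 2030 figure is given as "over $7.5 billion", the result is a lower bound.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of US dollars, as quoted in the article.
implied_rate = cagr(start_value=1.6, end_value=7.5, years=2030 - 2022)

if __name__ == "__main__":
    print(f"Implied annual growth: {implied_rate:.1%}")  # about 21.3%
```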

SuperDial Raises $15M to Automate Healthcare’s Endless Admin Phone Calls

Joan Aimuengheuwa — Tue, 24 Jun 2025 14:22:47 +0000

As AI agents reshape work across industries, SuperDial is targeting one of healthcare’s most expensive and invisible burdens: administrative phone calls.

Today, the company announced $15 million in new funding to scale its voice AI platform, which automates high-friction insurance calls that cost provider organizations and billing companies billions of dollars every year.

The debt and equity Series A round was led by SignalFire, with participation from Slow Ventures, BoxGroup, and Scrub Capital. It includes $3 million in venture debt for SuperDial to invest in R&D and go-to-market initiatives.

In total, the company has now raised over $20 million in funding. This also marks one of the first investments from SignalFire’s new $1 billion fund focused on applied AI.

SuperDial builds AI agents that handle outbound phone calls from providers and billing companies to insurers – navigating phone trees, waiting on hold, and conducting live conversations with payer reps.

These AI agents support tasks like benefits verification, prior authorisation, claims follow-up, and credentialing. When a call can’t be completed by an AI agent, SuperDial’s human call centre team steps in, ensuring reliable outcomes while continually improving the AI.
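The hand-off described above, where a human team takes over calls the AI cannot finish and those calls feed back into training, can be sketched as follows. This is an illustrative outline only, not SuperDial's implementation; every name in it (`CallResult`, `place_call`, `demo_ai_agent`, `demo_human_team`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CallResult:
    handled_by: str  # "ai" or "human"
    answer: str


def place_call(
    task: str,
    ai_agent: Callable[[str], Optional[str]],
    human_team: Callable[[str], str],
    training_log: list[str],
) -> CallResult:
    """Try the AI agent first; escalate to a human if it cannot finish."""
    answer = ai_agent(task)
    if answer is not None:
        return CallResult("ai", answer)
    # Record the failed task so it can inform later improvements to the AI.
    training_log.append(task)
    return CallResult("human", human_team(task))


# Hypothetical agents for demonstration.
def demo_ai_agent(task: str) -> Optional[str]:
    # Pretend the AI can only complete claims follow-up calls.
    return "Claim approved" if task == "claims follow-up" else None


def demo_human_team(task: str) -> str:
    return f"Resolved by human: {task}"


if __name__ == "__main__":
    log: list[str] = []
    print(place_call("claims follow-up", demo_ai_agent, demo_human_team, log))
    print(place_call("credentialing", demo_ai_agent, demo_human_team, log))
    print("Logged for training:", log)
```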

The platform integrates with EHRs and other systems of record to automate documentation, including writing back data gathered from calls, such as claims status updates. Customers rely on SuperDial not just to cut costs, but to unlock capacity across their revenue operations teams. Customers have reported up to 3x cost savings per call and 4x productivity gains for their existing billing teams.

SuperDial was founded by Sam Schwager and Harrison Caruthers, who met at Stanford while studying computer science. After building a healthcare billing company that spent thousands of hours on repetitive calls to payers, they saw the opportunity to automate the problem. What started as an internal tool quickly grew into a standalone solution.

“The timing is perfect for us to tackle this problem at scale, with AI capabilities quickly maturing and the healthcare sector looking for new ways to drive efficiency by leveraging next-gen technology. Our success to date, and the incredible level of interest and excitement we’re seeing from the market, are clear signs that we’re solving a real, urgent problem,” said Sam Schwager, co-founder and CEO of SuperDial.

Since launching at the end of 2023, the company has quickly scaled to seven figures in revenue and tens of thousands of calls per week.

Earlier this year, SuperDial acquired MajorBoost, a voice AI company specialising in navigating complex phone trees and insurer workflows. The acquisition deepened SuperDial’s technical team and further cemented its leadership in healthcare-specific call automation.

SuperDial’s growth comes as healthcare organisations seek to cut admin costs without expanding headcount. The $150 billion U.S. RCM market still relies on manual phone calls for basic tasks – calls that can take over an hour and pull staff away from higher-impact work.

SuperDial’s customers include RCM companies and large provider organisations – including DSOs and MSOs – that manage billing in-house. These customers rely on SuperDial to improve financial performance, reduce burnout, and unlock their teams’ capacity to focus on higher-value work.

At West Coast Dental, SuperDial now handles over 10,000 calls per month to check claim statuses, a process that previously left nearly 70,000 claims in backlog and would have required five new hires to process. With SuperDial, the team has significantly reduced AR days and gained trustworthy, up-to-date visibility into claims.

“SuperDial isn’t just automating phone calls – they’re building the connective tissue for how the healthcare ecosystem will communicate in the future,” said Yuanling Yuan, Partner at SignalFire.

“We believe agentic AI infrastructure is inevitable, and SuperDial is leading that shift with rapidly growing traction and a team that deeply understands the problem. This is exactly the kind of applied AI we’re excited to back.”

Looking ahead, SuperDial will deepen its EHR integrations, expand to new administrative workflows, and continue training its agents using real-world call data.

Although healthcare never built the APIs to enable clean, system-to-system communication, SuperDial is building the next best thing: a network of AI agents that can navigate fragmented infrastructure on behalf of the organisations that rely on it.

SuperDial believes the future of healthcare coordination will be agent-powered – where payers, providers, pharmacies, labs, and other healthcare organisations can seamlessly communicate with one another, AI-to-AI. And SuperDial will power that future.
