voice assistants Archives - Tech | Business | Economy

OpenAI Launches New Real-Time Voice Models for Translation, Live Conversations

Joan Aimuengheuwa — Fri, 08 May 2026 08:30:41 +0000

OpenAI has launched three new voice models for developers, expanding its real-time audio tools that can speak, translate and transcribe conversations as they happen.

The company said the new tools are designed to make voice-based apps more useful in everyday situations, especially where users need software to respond naturally while carrying out tasks in real time.

At the centre of the launch is GPT-Realtime-2, a voice model OpenAI says can handle more difficult requests while keeping conversations flowing naturally.

The company said that, unlike earlier versions, the model uses GPT-5-level reasoning to manage interruptions, understand context better and carry out actions during conversations.

OpenAI also unveiled GPT-Realtime-Translate, a live translation model that can translate speech from more than 70 input languages into 13 output languages.

According to the company, the system keeps pace with the speaker during conversations instead of translating after pauses or completed sentences.

The third model, GPT-Realtime-Whisper, focuses on live speech transcription. It converts spoken words into text instantly while a person is talking.

“Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” OpenAI said.

Voice products have become a major focus for technology companies as more users interact with software through speech instead of typing. OpenAI said developers want systems that can manage tasks while conversations continue naturally.

The company pointed to customer support, travel, education, media and creator platforms as some of the areas expected to benefit from the new models.

OpenAI also described three growing patterns it sees in voice-based software.

The first is “voice-to-action”, where users speak naturally and the system completes tasks on their behalf. OpenAI said property platform Zillow is building an assistant that can help users search for homes, avoid certain neighbourhood conditions and book tours through voice requests.

Another pattern is “systems-to-voice”, where software provides spoken updates automatically. OpenAI gave the example of travel apps that could alert passengers about delayed flights, new boarding gates or transfer routes without users typing commands.

The third area is “voice-to-voice”, which focuses on live multilingual conversations. OpenAI said Deutsche Telekom is developing customer support systems that translate discussions instantly while both sides continue speaking in their preferred languages.

Travel company Priceline is also working on voice-based trip management tools, according to OpenAI. Travellers could eventually book flights, change hotel reservations and receive airport updates entirely through conversation.

Alongside the broader rollout, OpenAI added several new features to GPT-Realtime-2 aimed at improving live interactions.

Developers can now enable short phrases such as “let me check that” or “one moment while I look into it” before the system completes a request. OpenAI said this gives users clearer feedback while the model processes tasks in the background.

The model can also call multiple tools at once and explain those actions aloud during conversations. OpenAI said the system may say things like “checking your calendar” or “looking that up now” while working through requests.
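For developers, the pattern described here (a short spoken preamble, then several tools running at once) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not use OpenAI's Realtime API, the tool functions `check_calendar` and `look_up_weather` are hypothetical stand-ins, and the `spoken` list merely represents phrases that a voice system would say aloud.

```python
import asyncio


# Hypothetical tools: stand-ins for real calendar or search integrations.
async def check_calendar(day: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"calendar for {day}: free after 3pm"


async def look_up_weather(city: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"weather in {city}: clear"


async def handle_request(spoken: list[str]) -> list[str]:
    # Give the user immediate feedback before any tool has finished.
    spoken.append("One moment while I look into it.")
    # Announce the actions, then run both tools concurrently.
    spoken.append("Checking your calendar and looking that up now.")
    results = await asyncio.gather(
        check_calendar("Friday"),
        look_up_weather("Lagos"),
    )
    # gather() returns results in the order the calls were passed in.
    return list(results)


def run_demo() -> tuple[list[str], list[str]]:
    spoken: list[str] = []
    results = asyncio.run(handle_request(spoken))
    return spoken, results
```

Calling `run_demo()` returns the phrases that would be spoken alongside the two tool results. Because both tools run concurrently, the total wait is roughly that of the slower tool rather than the sum of the two.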

The company added that GPT-Realtime-2 recovers better from errors or failed requests instead of stopping conversations abruptly. It also supports a larger context window, up from 32K to 128K tokens, allowing longer and more detailed conversations.

OpenAI further noted that the model has improved understanding of specialised terms, including healthcare vocabulary and proper nouns. Developers can also adjust how much reasoning power the model uses depending on the complexity of a request.

According to benchmark figures released by the company, GPT-Realtime-2 achieved higher scores than GPT-Realtime-1.5 on audio intelligence and instruction-following tests.

All three models are available through OpenAI’s Realtime API. The company said GPT-Realtime-Translate and GPT-Realtime-Whisper will be billed by the minute, while GPT-Realtime-2 pricing depends on token usage.
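The difference between the two billing approaches can be shown with a rough Python sketch. The article gives no prices, so every rate is an argument the caller must supply. The sketch also assumes per-minute billing is proportional to audio duration and that token billing uses separate input and output rates, neither of which is confirmed by the source.

```python
def per_minute_cost(audio_seconds: float, rate_per_minute: float) -> float:
    """Cost when billing follows audio duration (assumed proportional)."""
    return (audio_seconds / 60.0) * rate_per_minute


def per_token_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_1k: float,
    output_rate_per_1k: float,
) -> float:
    """Cost when billing follows token usage (rates are hypothetical)."""
    return (
        (input_tokens / 1000.0) * input_rate_per_1k
        + (output_tokens / 1000.0) * output_rate_per_1k
    )
```

Under per-minute billing, the cost of a session depends only on how long it lasts. Under token billing, it depends on how much is said and generated, so two calls of equal length can cost different amounts.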

OpenAI said it has added safeguards to reduce misuse, including protections against spam, fraud and harmful content. The company also added that conversations can be stopped automatically if they break its safety rules.


Nigeria’s Intron Launches Sahara v2 Voice AI Supporting 24 African Languages, 500 Accents

Joan Aimuengheuwa — Thu, 05 Mar 2026 16:44:36 +0000

Nigerian technology company Intron has launched a new voice recognition model designed to better understand African languages and accents, after years of complaints that global voice AI assistants often misinterpret local speech.

The model, called Sahara v2, supports 24 African languages and recognises more than 500 African English accents. The company said the system was trained using more than 14 million audio clips collected from over 40,000 speakers across Africa and the diaspora.

For many users on the continent, voice technology often struggles with everyday phrases and names. Common expressions can be misheard or completely distorted, making digital assistants unreliable for basic tasks.

Developers say the problem lies in how most global systems were built. Many were trained mainly on Western speech patterns and do not account for the tonal languages, varied accents and frequent language mixing common across African countries.

With Sahara v2, Intron says it wants to close that gap by building technology that listens to how people actually speak. The recordings used to train the system were gathered across a range of environments, including clinics, courtrooms, call centres, streets and offices.

The new model covers languages such as Hausa, Swahili, Yoruba, Igbo, Zulu, Twi, Kinyarwanda and Xhosa. In total, Intron says its systems now support 57 languages.

One of the additions is a bilingual speech recognition system that switches between English and Swahili. Intron developed the model with Kenya-based health provider Penda Health to better match how people naturally move between both languages in conversation.

The company also released a Hausa text-to-speech system designed to power local language voice assistants that can run continuously for services such as customer support.

Intron said the new system can also operate offline, allowing organisations to run voice tools locally where privacy or data security is a concern.

According to the company, Sahara v2 performs better on African speech than several widely used global models, including systems developed by Google, OpenAI, Amazon Web Services and Microsoft.

Testing carried out by the company showed stronger accuracy when recognising African names, locations, numbers and sector-specific terms used in areas such as finance, healthcare and telecommunications.

Several organisations have already begun using the system in their services. These include voice banking platforms, medical documentation tools, courtroom transcription systems and automated call centre software.

Ayo Oluleye, head of Data and Insights at ARM Investments, said the model improved the accuracy of automated transcription.

“Using Intron AI models, we’ve seen significant improvement in transcription and summaries compared to models we previously explored. Their systems capture context and nuance better, leading to more accurate results.”

Sarah Morris, chief product officer at Audere, said the system also performed well during testing. “In our testing, accuracy was excellent on several Southern African accents and APIs were robust with 99%+ success rates.”

Alongside the launch, Intron also released its first Africa Voice AI report for 2026, examining how voice technology is being developed and used across the continent.

The report aims to guide governments, businesses, investors and researchers working to expand digital services that rely on speech technology.

Tobi Olatunji, chief executive of Intron, noted that the project shows what happens when technology is designed with local languages in mind.

“Sahara v2 proves that when technology is built with deep cultural and linguistic understanding, amazing things can happen, and we’re just getting started.”
