OpenAI has begun rolling out a new advanced voice feature for ChatGPT, using the GPT-4o model to deliver notably realistic audio responses.
This feature, currently available in an alpha version, is being introduced to a select group of ChatGPT Plus users and is expected to expand to all Plus users by autumn 2024.
The launch follows controversy over the feature's initial demonstration, which included a voice bearing a striking resemblance to actress Scarlett Johansson.
Although OpenAI denied using Johansson's voice, the ensuing backlash prompted the company to remove the voice from its demo and delay the feature's release while it strengthened safety measures.
The Advanced Voice Mode distinguishes itself from previous iterations by allowing seamless and direct processing of audio inputs, eliminating the need for intermediate text conversion.
The result is a more fluid and efficient interaction, capable of recognising multiple speakers and interpreting emotion in speech, with the aim of delivering more empathetic, human-like responses.
OpenAI has introduced safety measures to address misuse, restricting the voice system to four preset options—Juniper, Breeze, Cove, and Ember—developed in collaboration with professional voice actors.
The company has implemented filters to prevent the generation of copyrighted content and impersonation of individuals, aiming to avoid the pitfalls of deepfake technology that have previously plagued the industry.
In preparation for the release, OpenAI conducted extensive testing with more than 100 external experts across 45 languages to assess the system's security and efficacy.
A detailed report on these safety efforts is expected in early August.