OpenAI’s GPT-4o model emulates the user’s voice in a noisy background because it gets confused, but the issue has been mitigated at a “system level”

OpenAI has trained its GPT-4o model to decline requests to generate copyrighted content, including audio.

What you need to know

OpenAI’s GPT-4o launch in May contributed to the biggest spike ever in ChatGPT’s revenue and downloads on mobile, and the app continues to perform well, with $28 million in revenue in July. These figures might improve further now that the ChatGPT maker has launched the long-awaited Advanced Voice Mode feature.

OpenAI indicated that it delayed the feature’s launch by one month to ensure it met the company’s threshold and security standards. It’s worth noting that access is currently limited to select ChatGPT users and sits behind the $20 Plus subscription. OpenAI says it is limiting the feature to a small group of users so it can gather feedback before expanding its capabilities.

The ChatGPT maker recently published a new blog post highlighting the safety challenges it has observed in Advanced Voice Mode and the measures it’s taking to mitigate them. Unauthorized voice generation using Advanced Voice Mode is a major concern for OpenAI. The company says the model is restricted to “pre-selected voices.” It will also use an output classifier to detect when the model goes off the rails.

There are issues but OpenAI is working on them

OpenAI admits GPT-4o may go off the rails and do things it’s not supposed to. For instance, the company says the model can emulate a user’s voice when the user is in a noisy environment. It further indicates that this odd behavior happens because the model struggles to understand the prompt over the background noise.

It’s worth noting that this issue no longer affects the model. Speaking to TechCrunch, an OpenAI spokesman indicated that the company has since added a “system-level mitigation” to GPT-4o to prevent the issue from recurring.

Another prevalent issue is speaker identification, which ties back to the safety and privacy issues around AI. OpenAI says that the model has been trained to decline requests to identify someone based on a voice in an audio input. However, it can still identify the people associated with famous quotes.

OpenAI and Microsoft have been under fire multiple times over the past few years for copyright infringement. Microsoft Copilot and ChatGPT have been spotted stealing content from publications without compensation or attribution.

The same issues were also identified in GPT-4o. OpenAI says the model is now trained to decline requests for copyrighted content, including audio. According to OpenAI:

“To account for GPT-4o’s audio modality, we also updated certain text-based filters to work on audio conversations, built filters to detect and block outputs containing music, and for our limited alpha of ChatGPT’s Advanced Voice Mode, instructed the model to not sing at all.”

Safety is seemingly becoming a core priority for companies like OpenAI and Microsoft. It’s interesting to see them address critical issues in flagship AI models before shipping them to broad availability, where such flaws could lead to major privacy and safety problems.

Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. You’ll also catch him occasionally contributing at iMore about Apple and AI. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.
