OpenAI’s new voice assistant can clone any voice from a 15-second sample. But it won’t be available to the public: here’s why.
OpenAI, the creator of ChatGPT, is entering the voice assistant market with an advanced technology that can replicate a person’s voice.
However, the company has decided not to make this technology widely available to the public due to safety concerns.
This new tool, Voice Engine, can copy someone’s voice from just a 15-second audio sample. Despite its potential, OpenAI is exercising caution to prevent misuse of this powerful capability.
Voice Engine is a big step forward in text-to-speech (TTS) systems. Traditional voice technologies often sound unnatural and lack emotion. With just a small audio sample, however, Voice Engine can create speech that not only sounds real but also carries the emotional tone of the original speaker.
First developed in late 2022, it is already used in OpenAI’s other products, including ChatGPT Voice and Read Aloud, where it makes the output sound more like human speech.
Even so, OpenAI is moving forward carefully. The company is holding back on a wider release due to concerns about how the tool might be misused, such as impersonating people without their permission. OpenAI is talking to various groups to find the best way to release this technology safely, and it wants to ensure that the technology does more good than harm.
OpenAI’s Voice Engine and Ethical Challenges
OpenAI has shared some early uses of Voice Engine, showing how it could be helpful in many ways.
It can assist those who have trouble reading by providing more natural-sounding voices, and it can help content creators reach a worldwide audience in their own voice but in different languages.
It’s also valuable for providing services in hard-to-reach places and giving a new way for people who cannot speak to communicate.
Yet this technology raises ethical questions, especially about privacy and the risk of fake impersonations. OpenAI is aware of these issues and has set rules for its partners. These rules include not impersonating anyone without their consent and clearly disclosing when a voice is generated by AI.
OpenAI is also looking into ways to track where AI-made audio is used and how to prevent its misuse.
Ultimately, several actions are essential:
- We must stop relying on voice-based authentication to get into bank accounts and access other private data.
- We must look into rules to safeguard people’s voices when AI is used.
- People must be taught what AI can and can’t do, including how it might deceive them.
- The development and adoption of methods for tracing where audio and video come from must be sped up, so that it’s always clear whether people are talking to a real person or an AI.
Voice Generation Market
OpenAI’s development of Voice Engine is an exciting look at what’s next in technology that can mimic human speech. As we enter a new phase where AI can closely copy human voices, it’s vital to consider how to manage the risks.
OpenAI’s approach of careful exploration and dialogue is a good example of balancing innovation with ethical concerns.
As we look ahead, data published by Market US about the AI Voice Generator Market underlines its rapid growth. The market, valued at $1.4 billion in 2023, is predicted to soar to $4.9 billion by 2032. This growth, at a Compound Annual Growth Rate (CAGR) of 15% from 2023 to 2032, highlights the increasing use of AI voice generators in various areas.
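For readers who want to check how these figures fit together, here is a minimal sketch of the compound-growth arithmetic. It assumes nine annual compounding periods between 2023 and 2032; the report's exact convention is not stated here, and the function names are purely illustrative.

```python
# Sanity check of the market figures quoted above.
# Assumption: growth compounds once per year over 2032 - 2023 = 9 periods.

def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Value after compounding `annual_rate` once a year for `years` years."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that turns `start_value` into `end_value` over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


YEARS = 2032 - 2023  # assumed number of compounding periods

# $1.4 billion growing at 15% a year
projected_2032 = project_value(1.4, 0.15, YEARS)

# Growth rate implied by going from $1.4 billion to $4.9 billion
rate = implied_cagr(1.4, 4.9, YEARS)

print(f"Projected 2032 value: ${projected_2032:.2f} billion")
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the projection lands at roughly $4.9 billion and the implied rate comes out just under 15%, so the quoted start value, end value, and growth rate are consistent with one another.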
These technologies are becoming crucial for creating realistic voices in movies and games and enhancing customer service with a personal touch.
Regionally, North America leads the AI Voice Generator Market, while the Asia Pacific region is expected to grow thanks to advancements in 5G technology and the presence of major companies like Baidu, Inc., Alibaba Cloud, and Huawei.