If you haven’t heard the news about ChatGPT going multimodal, check out this forum post:
We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms.
With that come five new voices created by a new text-to-speech model. The only question that remains is: which voice is your favorite?
Juniper
Sky
Cove
Ember
Breeze
If you haven’t heard the voices yet, you’ll find a selection of samples here.
API access does not exist yet, but OpenAI released this:
We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms.
Welcome to the developer community forum. You’ll have to wait a few weeks:
Plus and Enterprise users will get to experience voice and images in the next two weeks. We’re excited to roll out these capabilities to other groups of users, including developers, soon after.
I’ll need to listen to more samples and play with it a bit myself once it rolls out, but to my ear it doesn’t matter how pleasant a voice’s qualities are: if the intonation is off, it pulls me out of it.
I understand their concern about impersonating important people. However, they do not need to release that (i.e. the capability of crafting realistic synthetic voices from just a few seconds of real speech). Just give us a proper voice API, please! It will complete the circle: Whisper->Chat->Voice.
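For anyone picturing what that circle would look like in code, here is a rough sketch of the three stages chained together. Nothing here calls a real service: `VoiceLoop` and the three `fake_*` functions are made-up names for illustration, and the stubs just pass text through as bytes. The idea is only that each stage is a plain callable, so the voice stage can be a third-party service today and swapped out if an OpenAI voice API ever ships.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceLoop:
    """Hypothetical wrapper chaining speech-to-text, chat, and text-to-speech."""
    transcribe: Callable[[bytes], str]   # audio in -> text (the Whisper stage)
    chat: Callable[[str], str]           # text -> reply text (the Chat stage)
    synthesize: Callable[[str], bytes]   # reply text -> audio (the Voice stage)

    def run(self, audio_in: bytes) -> bytes:
        # Run the three stages in order and return the spoken reply.
        text = self.transcribe(audio_in)
        reply = self.chat(text)
        return self.synthesize(reply)


# Stand-in stages (assumptions, not real APIs): they treat the "audio"
# as UTF-8 text so the sketch runs without any network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_chat(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


loop = VoiceLoop(fake_transcribe, fake_chat, fake_synthesize)
reply_audio = loop.run(b"hello")
```

In a real setup each stub would be replaced by a function that wraps whichever speech-to-text, chat, and text-to-speech service you actually use.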
Currently, using the API, I can only send text to the chat endpoint, and there is no other documentation on the API site. Please make sure this gets documented as well so I can add the new options to the Smart Package Robot using the API.
Currently I am using ElevenLabs for speech. Of course, a Text_to_MP3 API would also be welcome.
ChatGPT should have the voice of a person new to the world but with extensive knowledge. So a cross between a kid and a professor.
So a “whiz kid” of some sort.
But maybe that’s just my personal lens.
EDIT:
The last thing we need to hear is the voice of someone we like making occasional errors or having a buffer overflow or glitch. But an honest kid can be easily forgiven.
I think there are some pretty solid reasons why we shouldn’t expect to hear a kid’s voice coming from OpenAI anytime soon.
I think there are absolutely perfectly reasonable use cases for such a voice, and they will inevitably be a thing, but I don’t think the company with the world’s biggest target on its back is going to be anywhere near the first-mover in that space.
It’s unfair that anyone should have an issue with OpenAI. Everyone knows they’re doing the best job they can. Artificial Intelligence is inevitable, and “someone” had to break the news to the general public. Otherwise only scientists and special interests would have the technology and be able to use it. I think what they’ve done is very democratic, and as the beneficiaries we should each declare our endorsement of their company and defend it against any undeserved negative comments, however indirect.
Not sure I follow. Do you mean like an accent? I really can’t tell what colour someone’s skin is on the telephone. OK, sure, if they have a particular accent like West Indian, or some of the African dialects that have heavy accents, but are you asking for a voice that is black sounding? How would one go about that?