How do I get a wider range of emotion out of TTS, like what was shown in the demo today?

I have used OpenAI's TTS API. The output is good, but not like what was demoed today. Is the API behind the demoed ChatGPT update coming to the public?

Is the audio output shown in today's demos coming from GPT-4o directly, or is a separate model doing the text-to-speech?

It comes directly from gpt-4o, but the model's audio capability is still being red-teamed.

As of today, the gpt-4o model is a text + vision model only.
