Chat gpt 4o TTS API lacking details

fel.h2o · May 19, 2024, 10:26pm

I read https://platform.openai.com/docs/guides/text-to-speech
It says it supports other languages, but the example does not show how to use them…

Has anyone been able to make it work?

supershaneski · May 19, 2024, 11:36pm

It supports outputting the spoken languages listed but has no language parameter yet. So you can input text in those languages and it should generate the corresponding audio. However, please note that for languages other than English, it might sound as if the text is being spoken by a foreigner.
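To illustrate the point above, here is a minimal sketch using the official `openai` Python package. The helper names (`build_speech_request`, `synthesize`), the `tts-1` model, the `alloy` voice, and the output file name are choices made for this example, not something stated in the thread. The request carries no language field; the text itself is simply written in the target language.

```python
def build_speech_request(text, voice="alloy", model="tts-1"):
    """Build the keyword arguments for a speech request.

    There is deliberately no language field: the spoken language
    is inferred from the text itself.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"model": model, "voice": voice, "input": text}


def synthesize(text, out_path="speech.mp3"):
    """Send the text to the speech endpoint and save the audio to out_path."""
    # Imported lazily so build_speech_request works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.audio.speech.create(**build_speech_request(text))
    response.write_to_file(out_path)
    return out_path


# Example call (needs the openai package and an API key):
# synthesize("Bonjour, comment allez-vous aujourd'hui ?", "bonjour.mp3")
```

Passing French, Spanish, or Japanese text to `synthesize` should produce audio in that language, subject to the accent caveat mentioned above.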

yjianghong · May 20, 2024, 12:06am

BTW, so far no docs say that TTS is GPT-4o. GPT-4o supports three modalities for input and output. Assuming they're not achieving that by simply piping text output into TTS, the TTS endpoint is not GPT-4o, given its obvious lack of modalities.

valehelle · May 20, 2024, 1:49am

It detects the language based on the text itself. This is a problem because there are languages that use the same words but pronounce them differently, like Malaysian and Indonesian. Aside from that, it won't be able to detect the language reliably if the text is short. You can try improving its accuracy by using something like this {language}: . The model will only speak what is in the <> brackets. I think you'll have better luck using ElevenLabs.
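One possible reading of the hint format described above, as a sketch only. The template in the post is incomplete, so the exact layout used here (language name, a colon, then the text in literal angle brackets) is an assumption, as is the helper name `with_language_hint`. Whether the model actually honours the hint is untested.

```python
def with_language_hint(language, text):
    """Prefix the text with a language label.

    Assumed layout: "<language name>: <text to speak>", with the
    spoken text wrapped in literal angle brackets.
    """
    return f"{language}: <{text}>"


# The result is what you would pass as the `input` of the speech request.
hinted = with_language_hint("Indonesian", "Selamat pagi")
print(hinted)
```

It is worth listening to the output to check that the label itself is not read aloud.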

anon22939549 · May 20, 2024, 6:29am

This capability is not yet publicly released.

kjordan · May 20, 2024, 8:13am

Thanks for the clarification. Any ETA on this one?

anon22939549 · May 20, 2024, 8:33am

When they’re convinced it’s safe to be released.

That’s not necessarily “safe” as in unlikely-to-create-Skynet-safe as much as won’t-inexplicably-start-shouting-racial-epithets-safe (I’m not suggesting this is what it’s currently doing or why it’s not released, this is a completely fabricated example of something they probably don’t want the model doing.)

They’re currently red-teaming the model. No one outside of OpenAI (and probably select partners) is privy to the current status of that process.

All I know is we’ve been told the audio-to-audio capability is the top priority and they’re working hard to bring it to all of us as quickly as possible.
