Is it possible to specify output language in text-to-speech?

Is it possible to specify output language in the T2S API? https://api.openai.com/v1/audio/speech

The reason is because let’s say I give it numbers in roman numeral (“1, 2, 3, 4”) and I want it to output in Chinese, by default it’s going to output in English.

Or is language only auto-detected?

Welcome to the dev forum @elin44.

In the current state, the TTS models cannot read Roman numerals in Chinese. You might be able to get the models to perform somewhat better by translating/transliterating the numerals into their respective Mandarin values or pronunciations.

Another, better, and more expensive way is to use the gpt-4o-audio-preview to generate the spoken tracks for your text, as it can be prompted to read the text how you want it to with much more human-like voices.

Thank you. This is a minor problem and a corner case, I’m able to live without it for my app.