Text to Speech switching between European and Brazilian Portuguese

The Text-to-Speech API sometimes mixes Brazilian Portuguese and European Portuguese accents in the output audio. I couldn’t figure out why, but it’s annoying when the reader’s voice changes accent in the middle of the text.

Is there any way to fix this?

The OpenAI TTS voices are based on real voice actors, so you will get an American accent regardless.

Why does it sound like somebody from California when I want the text spoken like someone from Louisiana or Australia? I can see how the same problem gets worse when the language is not one the developers speak natively.

You can add a lead-up to the prompt text and see if the AI will follow, such as “Now here’s Luiz, a native of Rio de Janeiro, speaking Brazilian Portuguese, with our presentation.” (or the same written in the destination language). Check whether what follows is affected at all by the lead-up; you would then discard the lead-up audio. You could also place [Brazilian Portuguese] in square brackets on a new line interspersed with the text, as these kinds of cues are often interpreted instead of spoken by tts-1, but I don’t have high hopes for that.

Also try tts-1-hd. It has made no difference for me in English, but might be smarter in other ways.
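
Here is a rough sketch of how you might try these suggestions, assuming the official openai Python package (v1 or later) and an OPENAI_API_KEY in your environment. The helper names, the "alloy" voice, the output file names, and the Portuguese lead-in wording are my own placeholders, not anything OpenAI documents as a fix. Cutting the spoken lead-in off the resulting audio is left out.

```python
# Sketch only: helper names, voice, and the lead-in sentence are illustrative assumptions.

# Assumed Portuguese rendering of the English lead-in; adjust to taste.
DEFAULT_LEAD_IN = (
    "Agora, com a nossa apresentação, Luiz, natural do Rio de Janeiro, "
    "falando em português do Brasil."
)


def with_lead_in(text, lead_in=DEFAULT_LEAD_IN):
    """Put a spoken lead-in before the real text; that audio is discarded later."""
    return f"{lead_in}\n\n{text}"


def with_bracket_cue(text, cue="[Brazilian Portuguese]"):
    """Put a bracketed cue on its own line before the text."""
    return f"{cue}\n{text}"


def synthesize(text, out_path, model="tts-1", voice="alloy"):
    """Send the text to the speech endpoint and save the returned audio bytes."""
    # Imported here so the helpers above can be used without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.audio.speech.create(model=model, voice=voice, input=text)
    with open(out_path, "wb") as f:
        f.write(response.content)


if __name__ == "__main__":
    texto = "Bom dia a todos. Vamos começar a nossa apresentação."
    # Same input through both models, so the accents can be compared by ear.
    for model in ("tts-1", "tts-1-hd"):
        synthesize(with_lead_in(texto), f"lead_in_{model}.mp3", model=model)
        synthesize(with_bracket_cue(texto), f"cue_{model}.mp3", model=model)
```

Listening to the four files side by side should tell you quickly whether either cue or the HD model changes anything for your text.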


There is an opportunity for a localized TTS service to cover the cases where OpenAI alone doesn’t sound the way you want.