“You can generate spoken audio in these languages by providing the input text in the language of your choice.”
I am trying to generate Dutch spoken text. This is one of the supported languages.
I think it is brilliant how close the generated voice actually comes to the real thing, but the US accent is clearly audible, especially in the ‘r’ and, to a lesser degree, the ‘z’ sound.
I was wondering … how often are the models updated?
Is there any chance that the multi-language support will be improved anytime soon?
Would it be an idea to add the language or locale (Belgian Dutch or Canadian French, for instance) to the API call?
Or is there a newer version that I can call? 3.5? 4.0?
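For context, here is a minimal sketch of what a speech request looks like, using only the Python standard library. The endpoint, the `tts-1` model name and the `alloy` voice come from the public API; the helper names `build_speech_payload` and `synthesize` are made up for illustration. As far as I know there is no language or locale field in the request, so the language is inferred from the input text alone.

```python
import json
import os
import urllib.request

SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_payload(text, voice="alloy", model="tts-1"):
    """Build the JSON body for a speech request (hypothetical helper).

    There is no language or locale key here: to my knowledge the API
    infers the language from the input text itself.
    """
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": "mp3",
    }


def synthesize(text, out_path="speech.mp3", **kwargs):
    """Send the request and save the audio (hypothetical helper).

    Requires the OPENAI_API_KEY environment variable and network access.
    """
    request = urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(build_speech_payload(text, **kwargs)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path


# Only build and show the payload here; nothing is sent until synthesize() is called.
payload = build_speech_payload("Goedemorgen, hoe gaat het met je?")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Calling `synthesize("Goedemorgen, hoe gaat het met je?")` would write the audio to `speech.mp3`. The only levers in this sketch are the model, the voice and the text itself, which is why a language or locale parameter would be a useful addition.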
Voice cloning is currently in beta testing, so that could mean an update in the near future. It also means that people will be able to create voices to fit these cases.
Well, based on a message I got, it looks like the voices are the same, so the information I had seen more than once from asking AIs was a “hallucination”. I find it funny that OpenAI models know a lot about the code and other things they can look up, including what they can do, but still have their moments.
Going to have to start asking it for links to data sources, lol.
The TTS model works great, reads correctly and with expression.
However, the obvious American accent significantly limits its usability.
For instance, if the input text is in Hebrew, I would prefer the output to have the accent of a native Hebrew speaker. Is it possible for the TTS model to generate audio without the American accent and with a clearer pronunciation of the specified language?
I don't think it's really a flaw; after all, the models were probably trained on North American speakers. What you are asking for would require speakers with each accent to record audio for the models to be trained on, to make them more diverse and natural.
From an API perspective, you have other options outside OpenAI.
Examples of some fantastic models that may achieve what you are looking for:
ElevenLabs: they offer a lot of models and model mixing, as well as training of new models. While not as cheap as the OpenAI models, they are on the same level and offer more realistic and emotional range.
You could also try some of these transformer models
This may lead you down the path of creating or fine-tuning a model of your own, if you have the hardware.
I think in time OpenAI models will continue to evolve, adding more options and styles in order to remain competitive in a booming market.