I’ve already tried all the available voices (like alloy, verse, ember, coral, etc.).
However, none of them really match the style I’m looking for.
Is there any way to create or fine-tune a custom TTS voice to achieve that kind of tone?
Or could I adjust the current voices (for example, pitch, speaking rate, or emotional expression) beyond what the instructions field allows?
The OpenAI models are going to speak Japanese with a noticeable American accent, like a foreign learner.
You also would not “teach” Japanese in a soft and cute voice, unless you are teaching little girls, diminutive train announcers, or men who want to sound feminine. The voice should mirror the user as a speaker: their gender and their social position relative to whoever they are talking to. At the very least, don’t start below the minimum level of politeness.
For what you are requesting, “sage” seems the most adaptable. The AI model works better with instructions in English (here it is answering about modern girl names for the same application, though it gives loan names like “Karen” or “Erika”).
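As a rough sketch of that suggestion, here is how a request using the “sage” voice with English-language instructions might be put together with the official `openai` Python package. The model name `gpt-4o-mini-tts`, the wording of the instructions, and the helper names `build_speech_request` and `synthesize` are my own assumptions for illustration, not something confirmed in this thread, so adjust them to your setup.

```python
def build_speech_request(text: str) -> dict:
    """Assemble keyword arguments for a speech request (hypothetical helper)."""
    return {
        # Assumed model name; use whichever TTS model accepts `instructions`.
        "model": "gpt-4o-mini-tts",
        # The voice suggested above.
        "voice": "sage",
        # The Japanese text to be spoken.
        "input": text,
        # Style instructions written in English; the wording is illustrative only.
        "instructions": (
            "Speak standard Japanese in a calm, polite tone, "
            "at a slightly slow pace suitable for language learners."
        ),
    }


def synthesize(text: str, out_path: str = "sample.mp3") -> None:
    """Send the request and save the audio (needs `pip install openai` and OPENAI_API_KEY)."""
    from openai import OpenAI  # imported lazily so the helper above works without the package

    client = OpenAI()
    # Stream the response body straight to a file on disk.
    with client.audio.speech.with_streaming_response.create(
        **build_speech_request(text)
    ) as response:
        response.stream_to_file(out_path)
```

Calling `synthesize("こんにちは、はじめまして。")` should write an audio file you can compare against the other voices. Keeping the request in a separate dict makes it easy to swap only the voice or the instructions between tries.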
(If you want humor, have a Japanese Windows 11 Narrator voice read English text. It sounds like the worst book-learner.)
I just launched the first version of Mika, my personal Japanese learning assistant. It gives you a Japanese word and a grammar point, lets you write a sentence, and then checks it while giving helpful feedback.
Mika is already online, but I noticed that I can’t include a link to it in this forum post. Is there any way to share a link so people can try it out?