Hello,
I implemented Azure OpenAI RealTime API with the voice of Shimmer, Alloy, Marilyn Breeze, et Munch to narrate a small adventure in a comics.
Feedbacks :
- OpenAI voice are great but still have english accent in French. It’s hard to find wich voice would best match.
- Being Voice Acting Director is GREAT I’m amazed by all the possiiblities and acting style.
- Would love a bookmark and viseme feature like Azure TTS SSML.
- Would love be able to record my custom voice.
It’s a mix of technologies involving Midjourney, KingAI, Suno, and more …
2 Likes
Super cool. Love the combination.
What made you go with the RealTime API vs a TTS model w/ pronunciations (IPA etc).
I imagine soon we will have these types of comics that “self-fill” the next panel based on user decisions
What made you go with the RealTime API vs a TTS model w/ pronunciations (IPA etc).
(I call OpenAI from Azure)
I tested the ChatCompletion API but the voice results was not good compared to RealTime API through WebSocket.
So I assume it was not the same engine (that would explain the price shift to x10) and so I call RealTime API instead.
But I might be wrong ? I’m curious if there is an API or something more simple to call the new Voice Model with instruction + prompt