Very excited to integrate our app with voice to voice, I’m assuming OpenAI would give us devs access - wondering if anyone has any ideas if and when this is happening?
I would image there will be at least some discussion of the text to speech at DevDay on November 6th, S2S will I think be up to the developer as there are so many ways to do it.
Right but I figured OpenAI would be releasing their own voice models and API integrations for quick ease of use to compete with eleven-labs or the like. I imagine OpenAI will have the better, cheaper tech.
Eleven Labs is OK but its not something I would feel comfortable using for long form conversations as it just bugs out too much mid sentence.
Indeed, I really like the natural sound of the OAI speech models, it would be interesting to see if they are going down the trainable 11labs style route or if iut will just be fixed, I also don’t know if they are using a 3rd party solution, in which case there may be no mention of it at all.
Am I confused? Does Open AI already offer voice models? AFAIK That was coming with voice to voice integration?
Right now the only offering is voice transcription via Whisper, I have not seen anything official about a text to speech option yet or any mention of a voice to voice system, much as ChatGPT’s conversation system has to be handled in code when using the API, i.e. adding the new chat to the end of the conversation list, I think voice input and output will be two unlinked objects that developers can then put together how they wish, that is assuming that a speech output API even becomes public. It is very possible that the speech output system is for phone apps only and will not be made available.
Counting the days until DevDay so these things become clearer.
If nothin is announced on the 6th I’ll probably just figure a way to do it with a third party.
Yes, there are certainly no shortage of options, but it would be nice if OpenAI provided an API for it,
Do you have any recommendations? I really only know eleven-labs. I am less interested in customizable voices and more interested in stability of the voice over long form conversations
could take a look at the link below, but it’s worth bearing in mind that if OpenAI are building something, might be worth waiting a week to see.