I have a set of trained data, assistants, etc. working brilliantly on GPT-4o.
And, for my use cases, the responses from GPT-4o beat Llama 3, Google, etc. hollow!
Now I would like to extend this to speech-to-speech. I could wait for GPT-4o voice to be made public on the API, but this could be a long wait (any ideas, anyone?).
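While I wait for that, the only thing I can think of is to extend my implementation to do text-to-speech and vice versa (speech-to-text on the way in, text-to-speech on the way out). But then, of course, I will be hit by latency: the text responses currently stream, whereas the audio could only start once there is enough text to speak.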
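One way I think I could limit that latency is to speak each sentence as soon as it is complete instead of waiting for the whole response. Below is a minimal sketch of just the buffering part, in plain Python. `sentences_from_stream` is my own hypothetical helper (not part of any SDK), the chunks are assumed to be the text deltas from the streaming response, and the sentence splitting is deliberately naive (it would break on abbreviations like "e.g.").

```python
import re
from typing import Iterable, Iterator

# Naive sentence boundary: '.', '!' or '?' followed by whitespace.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks.

    Hypothetical helper: `chunks` stands in for the text deltas of a
    streaming response.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _BOUNDARY.split(buffer)
        # All parts except the last are complete sentences; the last part
        # may still be growing, so it stays in the buffer.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    demo = ["Hello wor", "ld. How are", " you? I am", " fine"]
    for sentence in sentences_from_stream(demo):
        # In a real pipeline, each sentence would be handed to a
        # text-to-speech call here instead of being printed.
        print(sentence)
```

Each yielded sentence could then be sent to whatever text-to-speech service is in use, so audio for the first sentence can start while the rest of the response is still streaming. It would not remove the latency, only hide part of it.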
Does anyone have any better ideas on this? Or, when is GPT-4o voice likely to be made available publicly?
Thanks!