Hi there! I’m developing an application that uses OpenAI’s models for speech-to-speech consultations. I have a question about the Realtime API: is it possible to fine-tune the Realtime model on our own data so that it generates native audio in a speech-to-speech format, similar to OpenAI’s latest audio features? Is this kind of customization with our own data available? Thanks for any insights!
Same question. Is there any way to use fine-tuned models with the Realtime API?
+1 – Is fine-tuning the Realtime model currently possible, or is it on the near-term product roadmap?
I’d appreciate a reply on this, because my app relies on fine-tuning (text-to-text/audio) and I can’t migrate to the Realtime API without it.
I mainly want to train the LLM to respond in a certain way (logic, function calling); I’m not looking for speech-to-speech tuning.
Thanks!