I’m using the ChatGPT API together with Whisper (Telegram: @marcbot) to transcribe a user’s voice request and send the transcript to ChatGPT for a response. Being able to interact through voice is quite a magical experience.
I also use speech synthesis to turn ChatGPT’s response back into voice. For this I’d like to know which language the user is speaking, as that’s likely the language ChatGPT’s output is in.
Whisper does a great job transcribing many languages. It would be great if the API response also included the language it identified. I assume this is something the model is aware of? Having this in the response would allow me to choose the right speech synthesis model.
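
To make the request concrete, here is a minimal sketch of how I would use such a field if it existed. Everything in it is an assumption for illustration: the `language` key in the transcription response, the language names used as values, and the voice identifiers in `VOICE_BY_LANGUAGE` are all hypothetical, not something the API is confirmed to return.

```python
import json

# Hypothetical mapping from a detected language to a speech synthesis voice.
# Both the language names and the voice identifiers are made up for illustration.
VOICE_BY_LANGUAGE = {
    "english": "en-voice",
    "dutch": "nl-voice",
    "german": "de-voice",
}
DEFAULT_VOICE = "en-voice"


def pick_voice(transcription_json: str) -> str:
    """Return the voice for the language reported in the response.

    Assumes the response is a JSON object with an optional "language" key.
    Falls back to DEFAULT_VOICE when the key is missing or unknown.
    """
    payload = json.loads(transcription_json)
    language = str(payload.get("language") or "").strip().lower()
    return VOICE_BY_LANGUAGE.get(language, DEFAULT_VOICE)


# Example payload, shaped the way I'd hope the response would look.
example = json.dumps({"text": "Hoe laat is het?", "language": "dutch"})
voice = pick_voice(example)
```

With a field like this, the bot could pick the synthesis voice directly from the transcription step, and fall back to a default voice whenever the language is missing or not in the mapping.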