Whisper with Assistant API Thread

Can the Whisper API pass the transcribed text to a given thread, get the result from that thread, and optionally output audio in different languages, all in a single call?

Is anyone else interested in this idea, or does anyone know a solution?

You’re going to have to do several API calls for this. Also, Whisper doesn’t output audio; it only transcribes audio. If you want to output audio, you’ll have to use the text-to-speech (TTS) API.

  1. Pass your audio to the Whisper API and get the transcribed output.
  2. Add that output as a message on your assistant’s thread.
  3. Start a run and check its status until it says “completed”.
  4. Retrieve the latest message(s).
  5. Pass the message(s) to the TTS API so they can be read out loud.

It will ultimately have some latency, especially because the Assistants API isn’t as fast as the Chat Completions API, but it’s definitely feasible.
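
The five steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the v1 `openai` Python SDK with the beta Assistants endpoints, an already-created thread and assistant, and the `whisper-1` and `tts-1` models with the `alloy` voice. The function name `voice_round_trip` and its parameters are made up for illustration.

```python
import time


def voice_round_trip(client, audio_path, thread_id, assistant_id,
                     out_path="reply.mp3", poll_seconds=0.5, timeout=120.0):
    """Run one audio -> assistant -> audio turn and return the reply text.

    `client` is assumed to be an `openai.OpenAI()` instance (v1 SDK).
    """
    # 1. Transcribe the input audio with Whisper.
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1", file=audio_file
        )

    # 2. Add the transcript to the existing thread as a user message.
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=transcript.text
    )

    # 3. Start a run and poll until it leaves the transitional states.
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id
    )
    deadline = time.monotonic() + timeout
    while run.status in ("queued", "in_progress", "cancelling"):
        if time.monotonic() > deadline:
            raise TimeoutError("assistant run did not finish in time")
        time.sleep(poll_seconds)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread_id, run_id=run.id
        )
    if run.status != "completed":
        raise RuntimeError(f"run ended with status {run.status!r}")

    # 4. Retrieve the newest message and join its text parts.
    messages = client.beta.threads.messages.list(
        thread_id=thread_id, order="desc", limit=1
    )
    reply_text = "".join(
        part.text.value
        for part in messages.data[0].content
        if part.type == "text"
    )

    # 5. Convert the reply to speech and save it to disk.
    speech = client.audio.speech.create(
        model="tts-1", voice="alloy", input=reply_text
    )
    speech.write_to_file(out_path)
    return reply_text
```

To try it, create the client with `from openai import OpenAI; client = OpenAI()` and call `voice_round_trip(client, "question.wav", thread_id, assistant_id)` with your own thread and assistant IDs. Each of the five steps is a separate network round trip, and the polling loop in step 3 adds up to `poll_seconds` of extra wait on top of the run time itself.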

Thanks @turbolucius, I am doing that, but as you pointed out, latency is the killer right now, and it’s not just due to Whisper and TTS but some other practical reasons too. I am hoping to at least cut down the latency of the Whisper and TTS interactions with threads, if possible.