Hello everyone, I have successfully translated an English audio file to Chinese using Whisper and GPT-3.5-Turbo. However, I am unsure how to achieve real-time English-to-Chinese or Chinese-to-English translation when using a microphone. Can anyone advise me on how to accomplish this?
Something like this came to mind: 1- Capture the audio from the microphone with PyAudio and buffer it somewhere. 2- Send the buffered audio to the model in real time over a WebSocket, get the response back, and use it.
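A rough sketch of step 1, in case it helps. It cuts the microphone stream into fixed-length segments that could each be sent off for transcription and translation. The 16 kHz mono format, the 3-second segment length, and the names `SegmentBuffer` and `capture_segments` are my own assumptions for illustration, not anything Whisper or PyAudio requires.

```python
from typing import Iterator, List

SAMPLE_RATE = 16000   # assumed: 16 kHz mono input
SAMPLE_WIDTH = 2      # bytes per 16-bit sample
CHUNK_SECONDS = 3     # assumed segment length; tune for latency vs. accuracy


class SegmentBuffer:
    """Accumulates raw PCM bytes and hands back fixed-size segments."""

    def __init__(self, segment_bytes: int) -> None:
        self.segment_bytes = segment_bytes
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add new audio bytes; return every complete segment now available."""
        self._buf.extend(data)
        segments = []
        while len(self._buf) >= self.segment_bytes:
            segments.append(bytes(self._buf[:self.segment_bytes]))
            del self._buf[:self.segment_bytes]
        return segments

    def flush(self) -> bytes:
        """Return whatever partial segment is left and empty the buffer."""
        rest = bytes(self._buf)
        self._buf.clear()
        return rest


def capture_segments(seconds: int = CHUNK_SECONDS) -> Iterator[bytes]:
    """Yield microphone audio in segments of `seconds` (needs PyAudio and a mic)."""
    import pyaudio  # imported here so SegmentBuffer works without PyAudio installed

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=SAMPLE_RATE,
                     input=True, frames_per_buffer=1024)
    buf = SegmentBuffer(SAMPLE_RATE * SAMPLE_WIDTH * seconds)
    try:
        while True:
            # Each read returns 1024 frames of raw 16-bit PCM.
            for segment in buf.feed(stream.read(1024)):
                yield segment
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
```

Each segment from `capture_segments()` is raw PCM, so it would still need to be wrapped as WAV (or similar) before being handed to whatever transcription call you already use. Shorter segments lower the delay but give the model less context per request.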
One caveat: the OpenAI APIs have recently been experiencing latency and connection errors due to heavy load. This can hurt a real-time pipeline, since every delayed or failed request shows up directly as lag in the translation.
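One common way to soften the connection errors is to retry with exponential backoff. A minimal sketch follows; `call_with_retry` is a hypothetical helper, and the built-in `ConnectionError` and `TimeoutError` are stand-ins I am assuming here. Swap in whatever exceptions your API client actually raises.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T],
                    max_attempts: int = 4,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn(), retrying on connection/timeout errors with growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except (ConnectionError, TimeoutError):  # assumed error types
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            delay = base_delay * 2 ** (attempt - 1)
            # Add jitter so many clients do not all retry at the same moment.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")
```

You would wrap the request for each audio segment, for example `call_with_retry(lambda: send_segment(segment))`, where `send_segment` is a placeholder for your own request function. For real-time use keep `max_attempts` small, because a segment that arrives many seconds late is usually better dropped than translated.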