Hi everyone,
I hope you’re all doing well!
Over the past month, I’ve been building a Flutter app that integrates Whisper and a GPT assistant. The idea is for users to speak to the app, have their voice transcribed by Whisper, and receive spoken responses from GPT. However, I’m encountering significant issues with performance:
- Delay: There’s a 4–5 second lag between sending the transcription to GPT and receiving a response. While text input to GPT responds almost instantly, voice-to-voice interactions are frustratingly slow.
- Reliability: The feature works only about 20% of the time. Even for short audio recordings, the response fails to come through most of the time.
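To make the question more concrete, here is a rough sketch of how each stage of the pipeline could be timed separately, so it is clear whether the 4–5 seconds come from transcription, the GPT call, or speech output. It is written in Python rather than Dart for brevity. The stage functions (`transcribe`, `ask_gpt`, `synthesize`) are hypothetical stand-ins that only sleep; they are not my actual code and not any real API.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def time_stages(stages: List[Stage], initial: Any) -> Tuple[Any, Dict[str, float]]:
    """Feed each stage the previous stage's output and record seconds spent per stage."""
    timings: Dict[str, float] = {}
    value = initial
    for name, fn in stages:
        start = time.perf_counter()
        value = fn(value)
        timings[name] = time.perf_counter() - start
    return value, timings


# Hypothetical stand-ins for the real network calls (assumptions, not real APIs).
def transcribe(audio: bytes) -> str:
    time.sleep(0.02)  # pretend speech-to-text request
    return "hello"


def ask_gpt(text: str) -> str:
    time.sleep(0.05)  # pretend chat completion request
    return f"reply to: {text}"


def synthesize(text: str) -> bytes:
    time.sleep(0.01)  # pretend text-to-speech request
    return text.encode("utf-8")


PIPELINE: List[Stage] = [
    ("transcribe", transcribe),
    ("ask_gpt", ask_gpt),
    ("synthesize", synthesize),
]

if __name__ == "__main__":
    _, timings = time_stages(PIPELINE, b"fake-audio")
    for name, seconds in timings.items():
        print(f"{name}: {seconds * 1000:.0f} ms")
```

With the real calls swapped in, the per-stage numbers should show which step dominates the delay, and logging them for failed runs might also show where the missing responses get lost.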
I’m wondering:
- Is this a limitation of the GPT API itself?
- Could I optimize my implementation to reduce latency?
- Should I consider switching to another AI model?
The 4–5 second delay is a deal-breaker for my app’s user experience. Any advice or guidance would be greatly appreciated!
Thank you!
Eric