Voice Input in the ChatGPT App Needs Reliability Improvements
Hi OpenAI team and community,
I’m a frequent user of the ChatGPT mobile app and I genuinely appreciate the voice input feature. Being able to press the microphone button and speak freely—sometimes for up to 10 minutes—without typing is incredibly convenient. The transcription accuracy is impressively high, and it has made ChatGPT significantly more accessible and enjoyable for me.
That said, I’d like to raise several serious usability concerns that have caused repeated frustration, especially for long-form voice input users like myself:
⸻
- Unrecoverable Data Loss After Long Voice Inputs
The most painful issue is that voice recordings sometimes disappear after submission. I can speak for 10 minutes straight, hit the “chat” button, and then wait… only to receive nothing in return. No transcription, no error message, just silence. This seems to occur 5–10% of the time. When it does, everything I said is lost permanently.
Suggestions:
• Please implement a temporary audio buffer/cache, so we can retry if transcription fails.
• At minimum, alert users immediately if the app fails to start recording.
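To make the first suggestion concrete, here is a rough Python sketch of the behavior I have in mind: write the recording to a local cache before sending it, retry with backoff, and delete the cached copy only after a transcript comes back. I have no knowledge of how the app is actually built, so `transcribe`, `TranscriptionError`, and the cache layout are all hypothetical stand-ins.

```python
import time
from pathlib import Path
from typing import Callable


class TranscriptionError(Exception):
    """Hypothetical error raised when the transcription call fails."""


def submit_with_retry(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # hypothetical upload-and-transcribe call
    cache_dir: Path,
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
) -> str:
    """Cache the audio on disk first, then try to transcribe it."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # Persist the recording BEFORE any network call, so a failure cannot lose it.
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / f"recording-{time.time_ns()}.bin"
    cached.write_bytes(audio)

    for attempt in range(1, max_attempts + 1):
        try:
            text = transcribe(cached.read_bytes())
        except TranscriptionError:
            if attempt == max_attempts:
                # Give up for now, but leave the file on disk so the
                # user can retry later or export the raw audio.
                raise
            # Exponential backoff: base, 2x base, 4x base, ...
            time.sleep(base_delay_s * 2 ** (attempt - 1))
        else:
            # Only remove the cached copy once a transcript exists.
            cached.unlink()
            return text

    raise AssertionError("unreachable")
```

The point of the design is the ordering: the audio is safe on disk before anything can go wrong, and it is only deleted after success, so a failed request costs a retry rather than ten minutes of speech.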
⸻
- Unreliable Voice Capture Indicator
The visual feedback during recording is misleading. Often, the waveform animation freezes or displays static dots, which makes it unclear whether audio is being captured. In some cases audio is still recorded despite the frozen indicator; in others it is not.
Suggestions:
• Improve the reliability of the waveform indicator.
• Add a clear “recording started” confirmation or an alert when recording fails to initiate.
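As an illustration of what a trustworthy indicator could be driven by, here is a small Python sketch of a watchdog that reports status from the arrival of actual audio frames rather than from an animation. Everything here is an assumption on my part: the class name, the status strings, and the one-second timeout are hypothetical, and a real app would hook `on_frame` into its audio callback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaptureMonitor:
    """Tracks whether audio frames are really arriving (hypothetical design)."""

    started_at: float              # time the user pressed the mic button
    timeout_s: float = 1.0         # assumed tolerance before raising an alert
    last_frame_at: Optional[float] = None

    def on_frame(self, now: float) -> None:
        """Call this every time the microphone delivers a frame."""
        self.last_frame_at = now

    def status(self, now: float) -> str:
        """Return 'starting', 'recording', 'failed_to_start', or 'stalled'."""
        got_audio = self.last_frame_at is not None
        reference = self.last_frame_at if got_audio else self.started_at
        if now - reference > self.timeout_s:
            # No frame within the timeout: either capture never began,
            # or it began and then stopped delivering audio.
            return "stalled" if got_audio else "failed_to_start"
        return "recording" if got_audio else "starting"
```

With something like this, "recording started" can be confirmed as soon as the first frame arrives, and both failure states can trigger an alert instead of a frozen waveform.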
⸻
- Slow, Asynchronous Transcription (vs. Real-Time)
Unlike other AI apps that offer real-time transcription, ChatGPT’s voice input records the entire audio and only transcribes it after submission. While not a deal-breaker, this means slower feedback and less transparency, especially during long monologues where early errors can’t be caught and corrected.
Suggestions:
• Consider supporting partial or real-time transcription.
• At least show a live indicator that the app is actively listening and recording.
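For the partial transcription idea, a very simplified Python sketch: transcribe the recording in chunks as they arrive and show the growing text after each one. This is only my guess at one possible approach, not how any particular app does it. `transcribe_chunk` is a hypothetical function, and real streaming recognizers also have to deal with words that straddle chunk boundaries, which this ignores.

```python
from typing import Callable, Iterable, Iterator, List


def partial_transcripts(
    chunks: Iterable[bytes],
    transcribe_chunk: Callable[[bytes], str],  # hypothetical per-chunk transcriber
) -> Iterator[str]:
    """Yield the transcript so far after each audio chunk is processed."""
    parts: List[str] = []
    for chunk in chunks:
        parts.append(transcribe_chunk(chunk))
        # Skip empty results (e.g. silent chunks) when joining the text.
        yield " ".join(p for p in parts if p)
```

Even a coarse version of this would let a user notice within the first minute that nothing is being captured, rather than after ten.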
⸻
- Lack of Error Feedback or Retry Options
Currently, if anything goes wrong (e.g., connection issue, recording bug, transcription failure), the app offers no retry mechanism, no saved audio, and no explanation. This makes the user experience feel brittle, especially when using ChatGPT for important voice journaling, note-taking, or ideation.
Suggestions:
• Provide clear error messages and retry options.
• Let users access or save raw audio if transcription fails.
⸻
Final Thoughts
Again, I’m thankful for the voice input feature—it’s a brilliant addition and works wonderfully most of the time. But these edge cases create disproportionately negative experiences. Losing 10 minutes of spoken thoughts without warning or recourse feels unacceptable for a premium app in 2025.
I hope this feedback is useful for your development roadmap, and I’d love to know if others in the community have experienced similar issues—or if there are known workarounds.
Thanks for listening!