Enhancement Suggestion: Live Text Display in Voice Mode for Better UX

tiagocarvalho · February 17, 2025, 4:53pm

Hey OpenAI Team and Community,

First of all, huge congratulations on the incredible progress with ChatGPT! The voice mode is truly impressive and makes interactions feel more natural and engaging. However, after using it extensively, I noticed one area where the user experience could be greatly improved.

Current Limitation in Voice Mode

At the moment, when ChatGPT is in voice mode, users do not see the text of the AI’s response while it is being spoken. This makes it difficult to:
Follow longer or more complex responses.
Comprehend information clearly in noisy environments.
Retain key points without replaying the response.
Skim through or reference the response after the interaction ends.

Proposed Improvement: Live Text Display

To address this, I suggest modifying the UI for voice interactions to provide a more intuitive and accessible experience:

Minimized UI – Instead of keeping the ChatGPT interface fully visible, shrink it to a small indicator in the corner that shows when the AI is listening, processing, or speaking.
Live Text Display – While ChatGPT is speaking, display the response as text in real-time. This would allow users to read along, skim, and better absorb information without needing to rely solely on audio.
Toggle Option for Flexibility – Some users might prefer to keep the current experience, so having a toggle in settings to enable or disable live text display would ensure flexibility for different preferences.
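
To make the three points above a bit more concrete, here is a rough sketch of how they could fit together. It is only an illustration of the idea, not how ChatGPT actually works: the names VoiceState, VoiceSession, live_text_enabled, and speak are all made up for this example, and the response is assumed to arrive as a stream of text chunks alongside the audio.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class VoiceState(Enum):
    """States the small corner indicator would show (hypothetical names)."""
    LISTENING = "listening"
    PROCESSING = "processing"
    SPEAKING = "speaking"


@dataclass
class VoiceSession:
    # Hypothetical settings toggle: users who prefer the current
    # audio-only experience would set this to False.
    live_text_enabled: bool = True
    state: VoiceState = VoiceState.LISTENING
    transcript: str = ""

    def speak(self, chunks: Iterable[str]) -> list[str]:
        """Consume response chunks and return the text frames shown on screen.

        Each frame is the response text accumulated so far, which is what
        a live text display would render while the audio plays.
        """
        self.state = VoiceState.SPEAKING
        self.transcript = ""
        frames: list[str] = []
        for chunk in chunks:
            # Always keep the full text so it can be referenced later,
            # even when the live display is switched off.
            self.transcript += chunk
            if self.live_text_enabled:
                frames.append(self.transcript)
        self.state = VoiceState.LISTENING
        return frames


if __name__ == "__main__":
    session = VoiceSession()
    for frame in session.speak(["Hello", ", ", "world"]):
        print(frame)
```

In this sketch the full transcript is kept even when the toggle is off, so the response can still be skimmed after the interaction ends.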

Why This Would Be a Game-Changer

Improved Comprehension – Seeing the text while hearing the response helps with clarity and retention.
Accessibility & Inclusivity – Beneficial for users with hearing difficulties or those who process information better visually.
Enhanced Multi-Modal Experience – A seamless combination of text and speech makes interactions smoother.
More User Control – Allows users to scan responses without needing to replay audio.

Community Thoughts?

Would love to hear what others think! If you also feel this would improve voice mode, let’s give it visibility so OpenAI can prioritize it. If anyone has additional ideas to refine this concept, feel free to share!

OpenAI Team: If this aligns with your roadmap, I’d be happy to test it in beta or provide further feedback. Thanks for considering!
