RFC: Push-to-Talk (PTT) Button for better voice conversation control & experience

Dear OpenAI Team,

I’d like to propose adding a push-to-talk (PTT) feature to the ChatGPT Android app. This feature would allow users to control when the app listens by pressing and holding a button, with ChatGPT beginning its response only after the button is released.

Problem with Current Implementation

The current voice conversation mode often behaves unpredictably:

  • The app frequently fails to listen when expected, or starts listening only after a noticeable delay.
  • It sometimes picks up unintended input from background noise, interrupting its output.
  • In voice conversation mode, the UI prominently shows large (and rather abstract) black/blue discs or animated bubbles. However, these elements are purely passive visual indicators, which leads to user confusion. Many users likely tap on them instinctively, expecting some functionality such as activating listening mode, and get no response.

These issues can make the experience frustrating, especially in noisy environments or during interactions that require precision.

Benefits of PTT Mode

  1. Precise Control: Ensures listening and responding happen only when explicitly intended.
  2. Better Performance in Noisy Environments: Minimizes interruptions caused by background noise.
  3. Improved Clarity: Removes ambiguity about the app’s status (e.g., “Is it listening?”), without having to constantly look at the display.
  4. Resource Efficiency: Shortens active listening time, which lowers advanced voice conversation quota usage as well as battery consumption.
  5. Enhanced Privacy: The app listens only when the user presses the button, respecting user needs.
  6. User-Friendly Options: The feature could be activated through a toggle in the settings, to switch between a PTT mode and the current continuous listening mode.
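
To make the intended behavior concrete, here is a minimal sketch of the press/hold/release logic as a small state machine. It is only an illustration: `PushToTalkSession`, `PttState`, and every method name are hypothetical and do not correspond to any real ChatGPT or Android API. It also assumes that pressing the button during a response should interrupt it, so that the user rather than background noise decides when to cut in.

```python
from enum import Enum, auto


class PttState(Enum):
    IDLE = auto()        # microphone closed, nothing playing
    LISTENING = auto()   # button held, audio is being captured
    RESPONDING = auto()  # button released, assistant is answering


class PushToTalkSession:
    """Hypothetical controller, not part of any real app or API."""

    def __init__(self):
        self.state = PttState.IDLE
        self.captured = []   # audio chunks recorded while the button is held
        self.submitted = []  # utterances handed over for a response

    def press(self):
        # Assumption: pressing always (re)starts capture, even mid-response.
        self.captured = []
        self.state = PttState.LISTENING

    def on_audio(self, chunk):
        # Audio is kept only while the button is held; the rest is dropped.
        if self.state is PttState.LISTENING:
            self.captured.append(chunk)

    def release(self):
        # Releasing ends capture; a response starts only if something was said.
        if self.state is not PttState.LISTENING:
            return
        if self.captured:
            self.submitted.append(list(self.captured))
            self.state = PttState.RESPONDING
        else:
            self.state = PttState.IDLE
        self.captured = []

    def response_finished(self):
        if self.state is PttState.RESPONDING:
            self.state = PttState.IDLE


session = PushToTalkSession()
session.on_audio("background noise")  # ignored, button is not held
session.press()
session.on_audio("hello")
session.release()                     # the response would begin here
```

The settings toggle from point 6 would then simply choose between this controller and the current continuous listening mode.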

Thank you for considering this improvement, which I believe will enhance the app’s usability and reliability for all users.

Best regards,
Kristian Hasenjäger


The problem is that the developers need to write their apps against the vendor APIs and take advantage of lower-level OS capabilities. We have to let them know this is important. The Android app is inferior; it needs to be written against the device hardware.

This is an absolutely important feature for all OpenAI platforms (Web, Desktop Application, Mobile Apps …).
There used to be a microphone icon where you could transcribe a prompt without constantly being interrupted while thinking => we want it back!!
Now there seems to be only interactive voice control, which is a very bad experience when you are trying to enter a complex prompt or are thinking out loud through a complex context. The interruptions are very distracting and annoying unless you are just having a casual conversation.

Please bring back the microphone icon so that users have full control over when the AI should listen or process. I think this would also save tons of tokens and energy while allowing a much better experience for more complex prompting.
Currently, I am using external tools for transcription and pasting the result in.
Thanks.