Hi OpenAI team,
I would like to file a feature request regarding Voice Mode / Advanced Voice Mode in Custom GPTs that depend on Actions / function calls.
There are already several community reports about this issue. Earlier discussions first asked for Advanced Voice Mode support in Custom GPTs; later comments reported that Advanced Voice Mode became available, but Actions are skipped or not executed, while the same Custom GPT works correctly in text mode. There is also a newer thread specifically about function calling not working in Advanced Voice Mode. (OpenAI Developer Community)
My request is slightly different from “please make Voice Mode support Actions,” although that would of course be ideal.
The immediate product problem is this:
Voice Mode is still offered to users even when the Custom GPT cannot fulfill its core purpose without Actions.
For many Custom GPTs, Actions are not optional enhancements. They are the core integration layer. In my case, the GPT depends on backend calls for learner state, curriculum selection, progression logic, and next-step computation. In text chat, this works as expected. In Voice Mode, however, the required backend calls are not executed, so the user enters a degraded experience without understanding why.
This creates a very poor UX:
- the user assumes they are using the same Custom GPT experience,
- the GPT may claim or imply it can proceed,
- but the required backend logic is unavailable,
- and the creator has no clean way to prevent this broken flow.
Feature request:
Please provide at least one of the following options:
- Automatically disable Voice Mode for Custom GPTs where configured Actions are not supported in Voice Mode.
- Provide a builder-controlled setting to disable Voice Mode explicitly, for example:
  disable_voice_mode: true
  or:
  voice_mode: disabled
- Show a clear warning before entering Voice Mode, for example:
  “This GPT uses Actions that are not available in Voice Mode. Some functionality may not work.”
Expected behavior:
If a Custom GPT depends on Actions, users should not be silently routed into a mode where those Actions are unavailable. Either Voice Mode should support the Actions, or the builder should be able to disable Voice Mode for that GPT.
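To make the expected behavior concrete, here is a rough Python sketch of the gating logic I have in mind. It is only an illustration under my own assumptions: `GptConfig`, `voice_availability`, `voice_supports_actions`, and `warn_only` are hypothetical names, and `disable_voice_mode` is the proposed flag from this request, not an existing builder setting or OpenAI API.

```python
from dataclasses import dataclass, field
from enum import Enum


class VoiceAvailability(Enum):
    ENABLED = "enabled"    # Voice Mode offered normally
    WARN = "warn"          # Voice Mode offered, but with a clear warning first
    DISABLED = "disabled"  # Voice Mode not offered for this GPT


@dataclass
class GptConfig:
    # Hypothetical model of a Custom GPT's configuration (not a real schema).
    actions: list[str] = field(default_factory=list)
    disable_voice_mode: bool = False  # the proposed creator-side flag


def voice_availability(
    config: GptConfig,
    voice_supports_actions: bool,
    warn_only: bool = False,
) -> VoiceAvailability:
    """Decide whether Voice Mode should be offered for a given GPT."""
    # The creator's explicit opt-out always wins.
    if config.disable_voice_mode:
        return VoiceAvailability.DISABLED
    # No Actions configured, or Voice Mode can run them: nothing to gate.
    if not config.actions or voice_supports_actions:
        return VoiceAvailability.ENABLED
    # Actions are required but unavailable in Voice Mode:
    # either warn the user or hide Voice Mode entirely.
    return VoiceAvailability.WARN if warn_only else VoiceAvailability.DISABLED
```

With this logic, a GPT that has Actions configured is never silently routed into Voice Mode while those Actions are unavailable: the user either sees a warning or is not offered Voice Mode at all.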
Why this matters:
For education, workflow automation, enterprise assistants, and domain-specific copilots, the backend is often the source of truth. If Voice Mode bypasses that backend, the GPT may become misleading rather than merely limited.
This is especially painful on mobile, where Voice Mode is highly visible and attractive to users. The current behavior creates avoidable confusion and loss of trust.
A creator-side flag to disable Voice Mode would already be a major improvement while full Voice + Actions support is still unavailable or unreliable.
Thank you for considering this.