GPT-4o voice mode successfully used in live call with telecom AI bot

Hi everyone,

I wanted to share a successful real-world experiment I ran using GPT-4o’s voice mode (the new “wave” icon variant) in a live phone call with my telecom provider’s AI bot (O2 Czech Republic). I had ChatGPT running on one phone, speaking out loud, while a second phone held the call with O2’s voice system on speaker, so the two devices could hear each other.

Surprisingly, the O2 bot understood and processed ChatGPT’s voice output, even though it was framed as suggestions to me. The bot eventually forwarded me to a human agent without me saying a single word. GPT-4o effectively held a machine-to-machine voice interaction.

Notably:

  • ChatGPT didn’t recognize it was in voice mode (or at least didn’t realize its speech was being heard by another system rather than just by me).
  • Instead of responding directly to the AI bot, it spoke to me (“Try saying ‘complaint’”).
  • Nonetheless, the O2 bot interpreted the assistant’s voice output as if it came from a real caller.

This points to a promising use case, an “external voice agent mode” where ChatGPT doesn’t just talk to the user but speaks as the user: ideal for navigating voice menus, phone trees, or smart assistants.
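For anyone curious what an intentional version of this could look like, here’s a rough, hypothetical sketch of the listen → decide → speak loop. Everything in it is an assumption for illustration: the model names, the system prompt, the `sounddevice`/`soundfile` audio plumbing, and the choice of chaining speech-to-text → chat → text-to-speech rather than a true realtime audio pipeline. It is not how voice mode works internally.

```python
# Hypothetical sketch of one turn of an "external voice agent" loop:
# hear the phone menu, decide what to say, and say it into the call.
# Model names, prompt, and audio handling are illustrative assumptions.

import sounddevice as sd
import soundfile as sf
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
SAMPLE_RATE = 16_000

def listen(seconds: int = 6) -> str:
    """Record the IVR prompt from the microphone and transcribe it."""
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="int16")
    sd.wait()
    sf.write("ivr_prompt.wav", audio, SAMPLE_RATE)
    with open("ivr_prompt.wav", "rb") as f:
        return client.audio.transcriptions.create(model="whisper-1", file=f).text

def decide_reply(ivr_text: str, goal: str) -> str:
    """Ask the model what to say TO the menu, not what to suggest to the user."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": ("You are speaking directly into a phone menu on the "
                         "caller's behalf. Reply ONLY with the exact words to "
                         "say aloud, e.g. 'complaint' or 'operator'.")},
            {"role": "user", "content": f"Goal: {goal}\nMenu said: {ivr_text}"},
        ],
    )
    return resp.choices[0].message.content.strip()

def speak(text: str) -> None:
    """Synthesize the chosen words and play them into the call."""
    speech = client.audio.speech.create(model="tts-1", voice="alloy",
                                        input=text, response_format="wav")
    with open("reply.wav", "wb") as f:
        f.write(speech.content)
    data, rate = sf.read("reply.wav")
    sd.play(data, rate)
    sd.wait()

# One turn of the machine-to-machine exchange described above:
heard = listen()
speak(decide_reply(heard, goal="reach a human agent about a billing complaint"))
```

The key difference from what happened in my call is the system prompt: the model is told explicitly to speak as the caller, so it would say “complaint” into the menu instead of telling me to try saying it.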

:page_facing_up: Full feedback write-up (DOCX):
https://docs.google.com/document/d/118hmtEuW2FRIy2JMTwurcZp6sx7zjD3-/edit?usp=sharing

(Copy and paste the link into your browser)

Would love to hear what others think, and whether OpenAI might consider supporting an intentional “external voice agent mode” for interactions like this!