[Realtime API] gpt-4o-realtime-preview models audio downtime (no output) on Jun 2

Today I faced an intermittent issue where the gpt-4o-realtime-preview models were not outputting any audio, even though the session was instantiated correctly and the transcripts from the text model were being streamed out correctly. No errors were shown in the logs, either in my application or in the playground; everything seemed to be working fine except that no audio was being output.

The issue was also happening in the OpenAI playground directly. I’m not sure when it started, but audio was not working for me for at least 1 hour.

I switched to the gpt-4o-mini-realtime-preview models, and they worked fine in both my application and the playground.

I’m relatively new to integrating this technology. I know it is in beta, but I’m wondering whether anyone else has seen this behavior before and how frequent these incidents are.

I’ve been monitoring the status page, but no incident has been reported. I also reported this through the Support chatbot. The chatbot said the engineering team was notified, but I’m not sure whether that is true or whether they will post an update.


We encountered the same issue at the exact same time. The status page showed everything as green. Unfortunately, it happened during a live demonstration to investors.

We’re having issues with the latest realtime model version: the output text/audio is getting cut off at the end and not returning completely.


I have the exact same issue. It only happens on the new model: it starts speaking, stops towards the end, and then finishes off. It doesn’t happen on the previous models. I’m using WebSockets.

You need to prime the model.

Let’s say, then, that I need to prime the model but don’t understand what you are talking about.

Therefore: What are you talking about?


The issue I encountered was an intermittent downtime, and it seems it is not a frequent one. I’ll try to build something to detect when no audio is being output, but I guess that is going to be hard since the API was not returning any errors. If anyone has thoughts on this, I’d appreciate them.
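One possible shape for such a detector, as a rough sketch rather than a tested fix: count audio and transcript deltas per response and flag any response that finishes with transcript text but no audio. It assumes the server events are already parsed into dicts, and that the beta event names `response.audio.delta`, `response.audio_transcript.delta`, and `response.done` apply to your API version. It also assumes the outage showed up as missing audio deltas; if the deltas arrived but contained silence, this would not catch it. `SilentResponseDetector` is a hypothetical name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SilentResponseDetector:
    """Flags responses that produced transcript text but no audio chunks."""

    # response_id -> number of base64 characters of audio received so far
    audio_chars: dict = field(default_factory=dict)
    # response_id -> number of transcript characters received so far
    transcript_chars: dict = field(default_factory=dict)

    def feed(self, event: dict) -> Optional[str]:
        """Consume one parsed server event.

        Returns the response id if that response just finished with a
        transcript but without any audio, otherwise None.
        """
        etype = event.get("type")
        if etype == "response.audio.delta":
            rid = event.get("response_id")
            self.audio_chars[rid] = self.audio_chars.get(rid, 0) + len(event.get("delta", ""))
        elif etype == "response.audio_transcript.delta":
            rid = event.get("response_id")
            self.transcript_chars[rid] = self.transcript_chars.get(rid, 0) + len(event.get("delta", ""))
        elif etype == "response.done":
            rid = event.get("response", {}).get("id")
            audio = self.audio_chars.pop(rid, 0)
            text = self.transcript_chars.pop(rid, 0)
            if text > 0 and audio == 0:
                return rid  # transcript streamed, audio never arrived
        return None
```

Calling `feed` on every incoming message and alerting (or switching to a fallback model such as gpt-4o-mini-realtime-preview) when it returns an id would at least surface the problem without relying on an API error.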

I’ve also experienced the audio cut-off, but that is a separate issue. Here are some suggestions based on how I’ve been able to overcome it:

  • Most of it can be fixed with prompt engineering (that’s what @Code_Whisperer refers to with “prime the model”).
  • With prompting + function calling we were able to fix most of it (we still get it from time to time, but it is not as bad as before). The function call is there to force the conversation to end. In the prompt, you set conversation guardrails and guidelines for when a conversation should end, depending on your use case. After a “last input” from the user, the model must end the conversation through the function call rather than replying back.
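To make the second suggestion concrete, here is a minimal sketch of registering an "end the conversation" tool through a `session.update` event. The tool name `end_conversation`, its `reason` parameter, and the helper `build_session_update` are hypothetical, and the payload layout is an assumption based on the beta Realtime API, so check it against the current reference before relying on it.

```python
import json

# Hypothetical tool: the model calls it instead of speaking again once the
# user has given their final input.
END_CONVERSATION_TOOL = {
    "type": "function",
    "name": "end_conversation",
    "description": (
        "Call this after the user's last input when the conversation "
        "should end. Do not reply to the user after calling it."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "reason": {
                "type": "string",
                "description": "Short reason why the conversation is ending.",
            }
        },
        "required": ["reason"],
    },
}


def build_session_update(instructions: str) -> str:
    """Return a JSON session.update message carrying the prompt and the tool."""
    return json.dumps(
        {
            "type": "session.update",
            "session": {
                "instructions": instructions,
                "tools": [END_CONVERSATION_TOOL],
                "tool_choice": "auto",
            },
        }
    )
```

The instructions string is where the guardrails go, for example when the conversation counts as finished for your use case. When the model emits the function call, your application closes the session instead of requesting another response.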

But again, the purpose of this post was to discuss the audio downtime, and I would like to keep the discussion on that topic. If anyone has faced it before, or faced it on June 2nd, I would appreciate it if you could share your experiences and what you observed.