Hi, the new model “gpt-4o-realtime-preview-2024-12-17” has a problem with switching modalities. It worked well with the previous version and it is working well with the new mini model too, only the mentioned “gpt-4o-realtime-preview-2024-12-17” has this problem.
- I start the session with modalities set to [“text”] only.
- Send a text user message to the model, model replies with text only, that is correct
- Update the session with modalities set to [“text”, “audio”]
- Now when I talk, the model still responds with text only, with no audio data
Sometimes it starts responding with audio later in the same session suddenly. But sometimes it is like after two minutes, sometimes five, sometimes never.