I’m using the Realtime API to resume a conversation, emitting conversation.item.create events for each prior message (all text only). But the model often responds with only text, even though the session modality is text and audio. How can I force an audio response?
Have you tried explicitly asking for an audio response with every prompt you send?
No. So far I’m doing what others have suggested: summarizing the previous conversation into the system instructions.
You can try this!
Cheers.
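For anyone landing here later, the two ideas in this thread (replaying history as text items, then asking for audio explicitly on each turn) could be sketched roughly as below. This only builds the JSON event payloads and does not open a connection. The helper names history_item and audio_response_request are hypothetical, and the field names ("modalities", "input_text", "text", per-response "instructions") are assumptions based on the beta Realtime event schema; newer API versions may name them differently, so check the current reference before relying on this.

```python
import json


def history_item(role, text):
    """Build a conversation.item.create event for one prior text message.

    Assumption: user text uses content type "input_text" and assistant
    text uses "text", as in the beta Realtime event schema.
    """
    content_type = "input_text" if role == "user" else "text"
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": role,
            "content": [{"type": content_type, "text": text}],
        },
    }


def audio_response_request(extra_instructions=None):
    """Build a response.create event that asks for audio on this turn.

    Assumption: the response object accepts "modalities" and an optional
    per-response "instructions" string.
    """
    response = {"modalities": ["text", "audio"]}
    if extra_instructions:
        response["instructions"] = extra_instructions
    return {"type": "response.create", "response": response}


# Hypothetical prior conversation, text only.
prior_messages = [
    ("user", "What is the capital of France?"),
    ("assistant", "The capital of France is Paris."),
]

# Replay the history, then explicitly request an audio reply.
events = [history_item(role, text) for role, text in prior_messages]
events.append(audio_response_request("Always reply with spoken audio."))

# Each payload would be sent as one JSON text frame over the websocket.
frames = [json.dumps(event) for event in events]
```

Sending the response.create event after every new user message is the "ask for audio on every prompt" suggestion from above. Whether it fully stops the text-only replies is not something this thread confirms.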