Trouble Loading Previous Messages with Realtime API

ibiscp · December 10, 2024, 11:59pm

Hi everyone,

I’m having trouble loading previous messages into the Realtime API. Has anyone successfully managed to do this?

Here’s the sequence of events I’m sending:

{
  "type": "session.update",
  "session": {
    "modalities": ["text", "audio"],
    "instructions": "Assist the user.",
    "voice": "ash",
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm16",
    "input_audio_transcription": {"model": "whisper-1"},
    "turn_detection": null,
    "temperature": 0.8
  }
}
{
  "type": "conversation.item.create",
  "item": {
    "type": "message",
    "status": "completed",
    "role": "system",
    "content": [{"type": "input_text", "text": "Say hi to the user."}]
  }
}
{
  "type": "conversation.item.create",
  "item": {
    "type": "message",
    "status": "completed",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello, how can I assist you today?"}]
  }
}
{
  "type": "conversation.item.create",
  "item": {
    "type": "message",
    "status": "completed",
    "role": "user",
    "content": [{"type": "text", "input_text": "Hello, can you tell me a joke?"}]
  }
}
{
  "type": "conversation.item.create",
  "item": {
    "type": "message",
    "status": "completed",
    "role": "system",
    "content": [{"type": "input_text", "text": "The user interrupted the conversation, continue from where you stopped."}]
  }
}

After this I send a response.create message:

{
  "type": "response.create",
  "response": {"modalities": ["text", "audio"]}
}

The issue I’m experiencing is that sometimes I only get text responses without audio, or I encounter errors for some messages. I’ve been unable to get it working reliably.

If anyone has insights, tips, or a working example, I’d greatly appreciate your help!

Thanks in advance!

j0rdan · December 11, 2024, 8:18am

ibiscp:

{
  "type": "conversation.item.create",
  "item": {
    "type": "message",
    "status": "completed",
    "role": "user",
    "content": [{"type": "text", "input_text": "Hello, can you tell me a joke?"}]
  }
}

When the role is user, the content part type must be input_text. Also, the key holding the text must be text rather than input_text:

[{"type": "input_text", "text": "Hello, can you tell me a joke?"}]
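To avoid mixing up the content part types by hand, the history items could be built by a small helper. This is only a sketch: `make_history_item` and `CONTENT_TYPE_BY_ROLE` are hypothetical names, and the role-to-type mapping simply mirrors the payloads shown in this thread (input_text for user and system, text for assistant).

```python
import json

# Content part type per role, mirroring the payloads in this thread
# (an assumption, not an exhaustive list of what the API accepts).
CONTENT_TYPE_BY_ROLE = {
    "system": "input_text",
    "user": "input_text",
    "assistant": "text",
}


def make_history_item(role: str, text: str) -> dict:
    """Build a conversation.item.create event for one past text message."""
    if role not in CONTENT_TYPE_BY_ROLE:
        raise ValueError(f"unsupported role: {role!r}")
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "status": "completed",
            "role": role,
            "content": [{"type": CONTENT_TYPE_BY_ROLE[role], "text": text}],
        },
    }


history = [
    ("assistant", "Hello, how can I assist you today?"),
    ("user", "Hello, can you tell me a joke?"),
]
events = [make_history_item(role, text) for role, text in history]
for event in events:
    print(json.dumps(event))
```

Each printed line is one event, ready to send over the socket in order.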

Also, make sure to catch and log any error event you receive as it’s very easy to miss them.
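A minimal sketch of that logging, assuming each server message arrives as a JSON string and that error events have type "error" with an error object carrying code and message fields. `log_if_error` is a hypothetical helper, and the sample payload is made up for illustration.

```python
import json
import logging

logger = logging.getLogger("realtime")


def log_if_error(raw_message: str) -> bool:
    """Log the message and return True if it is an error event, else return False."""
    event = json.loads(raw_message)
    if event.get("type") != "error":
        return False
    error = event.get("error", {})
    logger.error(
        "Realtime error: code=%s message=%s",
        error.get("code"),
        error.get("message"),
    )
    return True


# Hypothetical payload, only shaped like an error event for demonstration.
sample = json.dumps(
    {"type": "error", "error": {"code": "example_code", "message": "Example message."}}
)
was_error = log_if_error(sample)
```

Calling this on every incoming message before any other handling makes it harder for a rejected history item to go unnoticed.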

That is a known issue; I haven’t been able to get it to work reliably either. The model tends to drop out of voice mode when you construct the conversation history from text messages.
One thing you could do is have the model generate a summary of the conversation at the end of a session and send that summary in your next session. This is a gimmicky workaround, though: it only works when the conversation ends gracefully (e.g. you don’t get disconnected from the session). It might help you get started until we get an official fix for this.
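One way to carry the summary into the next session is to append it to the instructions of the session.update event instead of replaying text items. The sketch below assumes that approach; `with_summary` is a hypothetical helper and the summary text is invented for the example.

```python
import copy


def with_summary(session_update: dict, summary: str) -> dict:
    """Return a copy of a session.update event with the summary appended to its instructions."""
    updated = copy.deepcopy(session_update)
    base = updated["session"].get("instructions", "")
    updated["session"]["instructions"] = (
        f"{base}\n\nSummary of the previous conversation:\n{summary}".strip()
    )
    return updated


base_event = {
    "type": "session.update",
    "session": {"modalities": ["text", "audio"], "instructions": "Assist the user."},
}
# Hypothetical summary produced at the end of the previous session.
next_event = with_summary(base_event, "The user asked for a joke and was interrupted.")
print(next_event["session"]["instructions"])
```

Because no text items are added to the conversation, this sidesteps the history construction that seems to trigger the text-only responses.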

ibiscp · December 11, 2024, 4:55pm

You are right, that was an error while writing the question here, in my code it is correct.

Do we have any contact with OpenAI team regarding this issue?

edwinarbus · December 12, 2024, 5:40pm

Sorry about this. This is a bug we’re currently looking into (reminder, the Realtime API is still in beta). One workaround is to add an audio message as your first user message, before the other history, which might help coax the model to respond with audio as well.

ibiscp · December 12, 2024, 8:08pm

Thank you for your response! I gave that approach a try by adding a 1-second silent audio as a user message before the others, but unfortunately, it didn’t work. I believe the best solution would be the one suggested by @j0rdan summarizing the previous conversation and including it as context.
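For anyone who wants to reproduce that attempt, a silent audio item could be built roughly as follows. This is a sketch under assumptions: pcm16 is taken to mean 16-bit mono samples at 24 kHz, the content part is assumed to be input_audio with a base64 audio field, and `make_silent_audio_item` is a hypothetical helper. As noted above, this did not restore audio responses in my case.

```python
import base64

SAMPLE_RATE_HZ = 24_000  # assumed sample rate for pcm16
BYTES_PER_SAMPLE = 2     # 16-bit mono


def make_silent_audio_item(seconds: float = 1.0) -> dict:
    """Build a user message item whose audio content is pure silence."""
    n_samples = int(SAMPLE_RATE_HZ * seconds)
    pcm = b"\x00" * (n_samples * BYTES_PER_SAMPLE)
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "status": "completed",
            "role": "user",
            "content": [
                {"type": "input_audio", "audio": base64.b64encode(pcm).decode("ascii")}
            ],
        },
    }


silent_item = make_silent_audio_item(1.0)
```

The item would be sent before the text history items, so it becomes the first user message in the conversation.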

Foxalabs · December 12, 2024, 8:42pm

Can confirm this works in a couple of use cases I am involved with!

j0rdan · December 16, 2024, 9:58am

Thanks for the clarification. I tried your suggested workaround, but it doesn’t work reliably (I saw it work a couple of times, but nowhere near often enough to be viable).
If the conversation history is quite long, there will be a lot of assistant messages of type text, which I think misguides the model to respond with text from that point onward.
Looking forward to any further updates on this.

killiandunne · January 20, 2025, 8:39pm

Hey any update here? This has caused me literal days of frustration - any suggestions or fixes would be hugely appreciated

ibiscp · January 21, 2025, 6:28pm

I tried it today again and it is still not working.

dima7711 · May 16, 2025, 9:14am

The bug still exists. Are there any plans to fix it?

alexred20 · June 24, 2025, 7:35pm

@edwinarbus any progress on fixing this? It was highlighted 6 months ago and completely breaks the calls in our app. It happens frequently across our user base. We’ll attempt a workaround, but it’s far from ideal. FYI, we use the mini model because the standard one is too expensive.
