`MODEL = "gpt-4o-realtime-preview"`
`MODEL = "gpt-4o-mini-realtime-preview"`
Advertised context window: 128k tokens
- If `initial_context_message` (in the code below) is over roughly 8,000 tokens, the model no longer answers follow-up questions properly. It only replies with something like "Hello, I am here to help."
- It also starts responding in text, even though I am interacting with it by audio.
- With shorter inputs it works fine: I ask in audio, and it properly responds in audio.
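For the text-fallback symptom, one thing worth double-checking is whether the session's response modalities are pinned explicitly rather than left to default. A minimal sketch of a `session.update` payload; the field names follow the Realtime API events as I understand them, so treat them as an assumption to verify against the docs:

```python
# Sketch: explicitly pin the session's response modalities so replies stay
# in audio. Field names ("type", "session", "modalities", "voice") are my
# reading of the Realtime API's session.update event, not verified here.
session_update = {
    "type": "session.update",
    "session": {
        # The API commonly expects "text" alongside "audio" rather than
        # audio alone.
        "modalities": ["text", "audio"],
        "voice": "alloy",  # example voice name; an assumption
    },
}
print(session_update["session"]["modalities"])
```

This would be sent over the same realtime connection before creating conversation items, so the modality preference applies to every response.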
Another user, @benko.csaba, noticed something similar: "Context limit smaller than documented".
```python
await stream.conversation.item.create(
    item={
        "type": "message",
        "role": "system",
        "content": [
            {"type": "input_text", "text": initial_context_message}
        ],
    }
)
```
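Since the ~8,000-token threshold is empirical, it can help to flag oversized context messages before sending them. A rough sketch using the common ~4-characters-per-token rule of thumb (this is a heuristic, not the model's real tokenizer, and the limit below is the observed threshold, not a documented one):

```python
# Rough token-budget check for initial_context_message before sending it.
# Heuristic: ~4 characters per token for English text; only useful for
# flagging obviously oversized inputs, not exact counting.
TOKEN_LIMIT = 8000  # empirical threshold observed above, not an official limit


def estimated_tokens(text: str) -> int:
    """Crude token estimate: about one token per 4 characters."""
    return len(text) // 4


def within_budget(text: str, limit: int = TOKEN_LIMIT) -> bool:
    """Return True if the text is likely under the empirical threshold."""
    return estimated_tokens(text) <= limit


short_msg = "You are a support assistant for ACME routers."  # placeholder
long_msg = "context " * 20000  # ~160k chars -> ~40k estimated tokens
print(within_budget(short_msg))  # True
print(within_budget(long_msg))   # False
```

For an exact count you would need the model's actual tokenizer; the heuristic is just a cheap guard before `conversation.item.create`.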
Update:
- Right after writing this post, I figured an official fix would be slow to arrive, given the lawsuit and data-retention obligations OpenAI has to worry about.
- Therefore, I tried this:
- `MODEL = "gpt-4o-realtime-preview-2025-06-03"`
- This actually fixed it. However, I don't have the pockets for it; I need the mini. Thank you.
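Since the dated snapshot works but costs more, it's convenient to keep the model name configurable so the two can be swapped without code changes. A minimal sketch; the `REALTIME_MODEL` environment-variable name is my own convention, not anything from the SDK:

```python
import os

# Sketch: choose the realtime model via an environment variable so the
# dated snapshot can be swapped in for testing without editing code.
# REALTIME_MODEL is a made-up variable name, not an OpenAI setting.
DEFAULT_MODEL = "gpt-4o-mini-realtime-preview"

MODEL = os.environ.get("REALTIME_MODEL", DEFAULT_MODEL)
print(MODEL)
```

Running with `REALTIME_MODEL=gpt-4o-realtime-preview-2025-06-03` switches to the snapshot that worked; leaving it unset falls back to the mini.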