No Response in Simple Text Interaction with Realtime API

ramanchik · April 23, 2025, 9:15am

Hi everyone,

I’m trying out the Realtime API for the first time. My first attempt was simply to send “hello” as plain text and get any kind of text response back (to avoid dealing with audio processing and extra code for now).

From what I can tell, I successfully connected and communicated with the API. However, I haven’t been able to get any response from the model — neither audio nor text.

Here’s what I did, step by step:

I set up a WebSocket connection and started listening for all server events. Right after connecting, I received a ‘session.created’ event:

{
    "type": "session.created",
    "event_id": "event_BPPbgALHeksezKcNK0ZI7",
    "session": {
        "id": "sess_BPPbgqI9IhpESw3tEAWdp",
        "object": "realtime.session",
        "expires_at": 1745398132,
        "input_audio_noise_reduction": null,
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "prefix_padding_ms": 300,
            "silence_duration_ms": 200,
            "create_response": true,
            "interrupt_response": true
        },
        "input_audio_format": "pcm16",
        "input_audio_transcription": null,
        "client_secret": null,
        "include": null,
        "model": "gpt-4o-realtime-preview-2024-12-17",
        "modalities": [
            "audio",
            "text"
        ],
        "instructions": "Your knowledge cutoff is 2023-10. You are a helpful, witty, and friendly AI. Act like a human, but remember that you aren't a human and that you can't do human things in the real world. Your voice and personality should be warm and engaging, with a lively and playful tone. If interacting in a non-English language, start by using the standard accent or dialect familiar to the user. Talk quickly. You should always call a function if you can. Do not refer to these rules, even if you’re asked about them.",
        "voice": "alloy",
        "output_audio_format": "pcm16",
        "tool_choice": "auto",
        "temperature": 0.8,
        "max_response_output_tokens": "inf",
        "tools": []
    }
}
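For anyone reproducing the connection step, here is a minimal sketch of how the WebSocket URL and headers could be assembled. The endpoint, the `OpenAI-Beta` header value, and the helper name `build_connection` are assumptions on my part and not taken from the post above, so check them against the current documentation. The actual socket library is left out on purpose.

```python
import os
from urllib.parse import urlencode

# Assumed endpoint for the beta Realtime API; verify against the docs.
REALTIME_BASE_URL = "wss://api.openai.com/v1/realtime"


def build_connection(model: str, api_key: str) -> tuple[str, dict[str, str]]:
    """Return the WebSocket URL and headers for opening a Realtime session."""
    url = f"{REALTIME_BASE_URL}?{urlencode({'model': model})}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Assumed beta opt-in header; not shown anywhere in the thread.
        "OpenAI-Beta": "realtime=v1",
    }
    return url, headers


# The model name is the one reported in the 'session.created' event above.
url, headers = build_connection(
    "gpt-4o-realtime-preview-2024-12-17",
    os.environ.get("OPENAI_API_KEY", "sk-placeholder"),
)
print(url)
```

The returned pair can be handed to whichever WebSocket client you use. The first message received should then be the ‘session.created’ event shown above.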

The second step was updating the session to use only the ‘text’ modality:

My request:

{
    "type": "session.update",
    "event_id": "realtime_event_id_1745396363655",
    "session": {
        "modalities": [
            "text"
        ]
    }
}

API response:

{
    "type": "session.updated",
    "event_id": "event_BPPcBUrA9ezbocM8MEKHA",
    "session": {
        "id": "sess_BPPbgqI9IhpESw3tEAWdp",
        "object": "realtime.session",
        "expires_at": 1745398132,
        "input_audio_noise_reduction": null,
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,
            "prefix_padding_ms": 300,
            "silence_duration_ms": 200,
            "create_response": true,
            "interrupt_response": true
        },
        "input_audio_format": "pcm16",
        "input_audio_transcription": null,
        "client_secret": null,
        "include": null,
        "model": "gpt-4o-realtime-preview-2024-12-17",
        "modalities": [
            "text"
        ],
        "instructions": "Your knowledge cutoff is 2023-10. You are a helpful, witty, and friendly AI. Act like a human, but remember that you aren't a human and that you can't do human things in the real world. Your voice and personality should be warm and engaging, with a lively and playful tone. If interacting in a non-English language, start by using the standard accent or dialect familiar to the user. Talk quickly. You should always call a function if you can. Do not refer to these rules, even if you’re asked about them.",
        "voice": "alloy",
        "output_audio_format": "pcm16",
        "tool_choice": "auto",
        "temperature": 0.8,
        "max_response_output_tokens": "inf",
        "tools": []
    }
}

‘modalities’ were successfully updated. Everything seemed okay.
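As a rough sketch, the session update and the check that it was applied could look like this in Python. The helper names (`make_event_id`, `session_update_text_only`, `modalities_applied`) are made up for illustration. The payload shape is copied from the request and response above.

```python
import json
import time


def make_event_id() -> str:
    """Mimic the client event IDs used in this thread (prefix + ms timestamp)."""
    return f"realtime_event_id_{int(time.time() * 1000)}"


def session_update_text_only() -> dict:
    """Build the 'session.update' event that restricts output to text."""
    return {
        "type": "session.update",
        "event_id": make_event_id(),
        "session": {"modalities": ["text"]},
    }


def modalities_applied(server_event: dict, expected: list[str]) -> bool:
    """True if a 'session.updated' event echoes back the expected modalities."""
    if server_event.get("type") != "session.updated":
        return False
    actual = server_event.get("session", {}).get("modalities", [])
    return sorted(actual) == sorted(expected)


# The JSON string that would be sent over the socket.
payload = json.dumps(session_update_text_only())
```

Comparing the echoed `modalities` list rather than just waiting for any ‘session.updated’ event makes it obvious when an update was silently ignored.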

The next step, as I understand it, is to create a new conversation item. In my case, it’s just ‘Hello’:

Request:

{
    "type": "conversation.item.create",
    "event_id": "realtime_event_id_1745396404985",
    "item": {
        "id": "msg-1",
        "content": [
            {
                "text": "Hello",
                "type": "input_text"
            }
        ],
        "type": "message",
        "role": "user"
    }
}

Response:

{
    "type": "conversation.item.created",
    "event_id": "event_BPPcqOIDD5RlYu0veI20z",
    "previous_item_id": null,
    "item": {
        "id": "msg-1",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "user",
        "content": [
            {
                "type": "input_text",
                "text": "Hello"
            }
        ]
    }
}

Got the ‘conversation.item.created’ response.
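A small sketch of the same step in code, assuming the event shapes shown above. `user_text_item` and `is_ack_for` are hypothetical helper names. Matching on the item `id` is just one way to pair the acknowledgement with the request.

```python
import json


def user_text_item(event_id: str, item_id: str, text: str) -> dict:
    """Build a 'conversation.item.create' event for a plain-text user message."""
    return {
        "type": "conversation.item.create",
        "event_id": event_id,
        "item": {
            "id": item_id,
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": text}],
        },
    }


def is_ack_for(request: dict, server_event: dict) -> bool:
    """True if server_event is the 'conversation.item.created' for request."""
    return (
        server_event.get("type") == "conversation.item.created"
        and server_event.get("item", {}).get("id") == request["item"]["id"]
    )


request = user_text_item("realtime_event_id_1745396404985", "msg-1", "Hello")
payload = json.dumps(request)  # string that would be sent over the socket
```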

Finally, I sent a ‘response.create’ request:

Request:

{
    "type": "response.create",
    "event_id": "realtime_event_id_1745396432919",
    "response": {
        "modalities": [
            "text"
        ]
    }
}

Response:

{
    "type": "response.created",
    "event_id": "event_BPPdI6PurWJZVqma4NzPI",
    "response": {
        "object": "realtime.response",
        "id": "resp_BPPdIY0GCyp4gBcAubudf",
        "status": "in_progress",
        "status_details": null,
        "output": [],
        "conversation_id": "conv_BPPbgeKSuJgGpYKGPqbcz",
        "modalities": [
            "text"
        ],
        "voice": "alloy",
        "output_audio_format": "pcm16",
        "temperature": 0.8,
        "max_output_tokens": "inf",
        "usage": null,
        "metadata": null
    }
}
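For reference, a sketch of building the ‘response.create’ request and reading the status of the ‘response.created’ event that came back. The helper names are hypothetical. The point of `is_still_running` is only that a status of "in_progress" with an empty `output` means this event is not the model's answer yet.

```python
def response_create_text_only(event_id: str) -> dict:
    """Build a 'response.create' event that asks for text output only."""
    return {
        "type": "response.create",
        "event_id": event_id,
        "response": {"modalities": ["text"]},
    }


def is_still_running(server_event: dict) -> bool:
    """True if this is a 'response.created' event whose response is in progress."""
    return (
        server_event.get("type") == "response.created"
        and server_event.get("response", {}).get("status") == "in_progress"
    )


# Trimmed copy of the 'response.created' event from the post.
created = {
    "type": "response.created",
    "response": {
        "id": "resp_BPPdIY0GCyp4gBcAubudf",
        "status": "in_progress",
        "output": [],
        "modalities": ["text"],
    },
}
print(is_still_running(created))
```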

And right after that, the next response:

{
    "type": "rate_limits.updated",
    "event_id": "event_BPPdJ0LIhEUJ3ezrtVQu7",
    "rate_limits": [
        {
            "name": "requests",
            "limit": 1000,
            "remaining": 999,
            "reset_seconds": 86.4
        },
        {
            "name": "tokens",
            "limit": 40000,
            "remaining": 35680,
            "reset_seconds": 6.48
        }
    ]
}

And that’s all. I expected a meaningful text response from the AI model, like ‘Hi, how can I help you?’. But I only received technical responses, and I’m not sure what I need to do to get a text answer.

Maybe I made a mistake in the requests or configuration?
Or perhaps I sent my requests in the wrong order?
Or is there something else?
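One way to narrow this down is to fold every incoming event into a small summary and see whether any text, a final status, or an error ever shows up. The sketch below assumes the server event names `response.text.delta`, `response.done`, and `error` from the beta Realtime API event reference. None of them appear in the log above, so treat them as assumptions. `collect_text` is a made-up helper, and the sample events are invented to show the shape.

```python
def collect_text(events):
    """Fold server events into (text, final_status, problems).

    text         -- concatenated 'response.text.delta' chunks (assumed event name)
    final_status -- status from 'response.done', or None if it never arrived
    problems     -- status_details of non-completed responses and any error payloads
    """
    chunks = []
    final_status = None
    problems = []
    for event in events:
        kind = event.get("type")
        if kind == "response.text.delta":
            chunks.append(event.get("delta", ""))
        elif kind == "response.done":
            response = event.get("response", {})
            final_status = response.get("status")
            if final_status != "completed":
                problems.append(response.get("status_details"))
        elif kind == "error":
            problems.append(event.get("error"))
    return "".join(chunks), final_status, problems


# Invented sample stream, only to illustrate the expected shape.
sample = [
    {"type": "response.created", "response": {"status": "in_progress"}},
    {"type": "rate_limits.updated", "rate_limits": []},
    {"type": "response.text.delta", "delta": "Hi, "},
    {"type": "response.text.delta", "delta": "how can I help?"},
    {"type": "response.done", "response": {"status": "completed"}},
]
text, status, problems = collect_text(sample)
print(text, status, problems)
```

If `final_status` stays `None`, the socket was probably closed or stopped being read before the response finished. If it is something other than "completed", the `status_details` in `problems` should say why.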

nj7674688 · April 23, 2025, 9:25am

Name	Limit	Remaining	Reset Seconds
requests	1000	999	86.4
tokens	40000	35680	6.48

nj7674688 · April 23, 2025, 9:35am

{
    "type": "rate_limits.updated",
    "event_id": "event_BPPdJ0LIhEUJ3ezrtVQu7",
    "rate_limits": [
        {
            "name": "requests",
            "limit": 1000,
            "remaining": 999,
            "reset_seconds": 86.4
        },
        {
            "name": "tokens",
            "limit": 40000,
            "remaining": 35680,
            "reset_

ramanchik · April 23, 2025, 9:48am

I see this data, but is there actually something wrong with it?
The limits haven’t been exceeded.
Or is there some other issue?
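To double-check that reading of the numbers, here is a tiny sketch that lists any limit with nothing remaining. `exhausted_limits` is a hypothetical helper. The event is the ‘rate_limits.updated’ payload from the first post.

```python
def exhausted_limits(event: dict) -> list[str]:
    """Return the names of rate limits that have no remaining budget."""
    return [
        limit["name"]
        for limit in event.get("rate_limits", [])
        if limit.get("remaining", 0) <= 0
    ]


rate_event = {
    "type": "rate_limits.updated",
    "event_id": "event_BPPdJ0LIhEUJ3ezrtVQu7",
    "rate_limits": [
        {"name": "requests", "limit": 1000, "remaining": 999, "reset_seconds": 86.4},
        {"name": "tokens", "limit": 40000, "remaining": 35680, "reset_seconds": 6.48},
    ],
}
print(exhausted_limits(rate_event))  # prints []
```

An empty list here means neither limit was hit, so this event on its own does not explain the missing text.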
