The "Audio out" wav is less than "Generation"

hl_l · February 3, 2026, 8:01am

I I am using realtime api with sip to communicate with openai, is this normal or a bug?

_j · February 3, 2026, 1:33pm

The text is an AI transcription from hearing the produced audio. It can make mistakes.

Does the audio file itself sound complete, or is it truncated mid-speech?

hl_l · February 5, 2026, 3:09am

the audio file only lost “Correct?” than the text output.

our prompt wants ai to say “Let me confirm your order. You ordered [[all dishes]]. The total is ${{price}}. Correct?“, so the text output is what we want, but the audio output is wrong.

_j · February 5, 2026, 7:45am

I can’t say if it is a bug, so much as a model behavior.

The first thing I would try: have that be a complete sentence: “Is your order correct?”

Then prompt up the phrase more in system message as a final output requirement after stating or restating someone’s order.

You can’t place your own “assistant” messages as audio, I suspect so that you can’t influence the speech. However, you could inject a user message early in proper context, “system reminder, after reciting an order you must employ the phrase Is that correct” - and then place that phrase as a recording of the chosen voice model’s output, it saying the message in the tone you want.

You can consider other turns of an order conversation that also need structure reinforced in similar manner with more verbose language that won’t result in truncation by the generated audio token stream trailing off or whatever is happening.

hl_l · February 5, 2026, 8:33am

I have tried the follow prompt before:
“Let me confirm your order. You ordered [[all dishes]]. The total is ${{price}}. Is everything correct?”
then the text output is correct, but the audio output lost the “Is everything correct?” sometimes

As the same, I have another prompt to make the AI say:
“Please wait, I’ll place your order now. May I have your name?”
the text output is correct, but the audio output lost the “May I have your name?” sometimes

Topic		Replies	Views
The output audio does not fully match the output text; it ends early API api , realtime	2	518	October 11, 2024
Completions of gpt-4o-mini-audio-preview model missing audio in response Feedback typescript	1	241	June 7, 2025
Realtime conversations API - text and audio not consistent. Advice? Bugs	1	173	July 2, 2025
Response audio suddenly cuts off when using "gpt-4o-audio-preview-2024-12-17" Bugs bug , gpt-4o-audio-preview	3	660	February 19, 2025
Realtime API re-consuming it's own output audio as input audio API audio , realtime , api-realtime , api-realtime-speech	10	1313	January 10, 2025

The "Audio out" wav is less than "Generation"

Related topics