Description:
When using the gpt-4o-transcribe model for speech-to-text conversion, the model fails to handle audio that contains pauses. During pauses (e.g., silence or gaps in speech), the output becomes inconsistent: segments are sometimes dropped, or only part of the input is transcribed.
Steps to Reproduce:
- Send an audio file/stream containing a pause (e.g., "Best places to visit in london…Best places to visit in uk")
- Call the OpenAI API with gpt-4o-transcribe (via cURL, as per the docs)
- Observe inconsistent outputs such as:
  - Dropped segments: "Best places to visit in uk" (first part missing)
  - Partial segments: "Best places to visit in london" (second part missing)
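For reference, a minimal cURL invocation of the transcription endpoint along the lines of the OpenAI docs (the file name `pause_sample.wav` is a placeholder for any audio clip containing a pause):

```shell
# Placeholder file name; substitute any audio with a mid-utterance pause.
curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F model="gpt-4o-transcribe" \
  -F file="@pause_sample.wav"
```

Running the same request repeatedly on the same file is enough to see the output vary between the dropped-segment and partial-segment cases above.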
Expected Behavior:
The model should consistently transcribe the full audio input, including pauses, similar to Whisper’s output:
"Best places to visit in london Best places to visit in uk"
- The Whisper model handles the same input correctly every time, but gpt-4o-transcribe does not.