I’m using the transcription endpoint at /v1/audio/transcriptions. I’ve noticed that when I use the gpt-4o-transcribe or gpt-4o-mini-transcribe models with this endpoint, the transcript I receive back is truncated, usually somewhere between the 10- and 11-minute mark of my 12-minute recording. However, when I use the whisper-1 model, I get the full transcript back.
To clarify - the JSON response itself is well-formed and complete; it’s just that the text property of the response doesn’t have the entire content of the audio file.
I’ve tried audio files of different formats and sizes - mp3, mp4/aac, lower bitrates to reduce file size, mono vs. stereo, etc. But I can’t seem to get it to process the entire file. The file sizes I’ve tried have ranged from 8.5 MB to 17.0 MB.
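For reference, here’s roughly how I’m calling the endpoint (a minimal Python SDK sketch; the filename is just a placeholder):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "meeting.mp3" stands in for my ~12-minute recording
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",  # with "whisper-1" the same call returns the full text
        file=audio_file,
        response_format="json",
    )

# The JSON parses fine; transcript.text just stops around the 10-11 minute mark
print(transcript.text)
```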
Naturally, I could just use the whisper-1 model, but the other models have been more accurate for the niche topics I’m discussing in the recording, so I’d prefer to use them.
Has anyone else run into this issue, or have any ideas how to work around it?
Another question: has anyone tried .ogg with the new gpt-4o-transcribe? Whisper had no problem with it, and it’s how I’m getting around the 25 MB limit (by compressing the audio as much as possible), but I get an error from the new model when I try to upload .ogg.
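In case it’s useful, this is roughly how I do the compression (a sketch assuming ffmpeg with libopus is available; filenames and bitrate are just examples):

```python
import subprocess

# Re-encode to mono Opus in an .ogg container at a low bitrate so the
# file lands well under the 25 MB upload limit; speech stays intelligible.
subprocess.run(
    [
        "ffmpeg", "-i", "input.mp3",
        "-c:a", "libopus",  # Opus codec in an .ogg container
        "-b:a", "24k",      # aggressive speech bitrate
        "-ac", "1",         # downmix to mono
        "output.ogg",
    ],
    check=True,
)
```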
Same problem here. I’m uploading an mp3 that is well within the 25 MB limit, yet the transcription cuts off just under the 10-minute mark. I tried it with two audio files, one about 10m30s and the other about 12m, and both cut off around the same point.
I’ve had the same issue. I’m using the MediaStreamTimeProcessor recorder, which auto-slices the audio every 45 seconds. With whisper-1 it works perfectly, and I don’t experience any truncated transcripts. However, with the new gpt-4o-transcribe model, my audio-chunk transcripts often lack a good part of the recording, usually at either the beginning or the end of a chunk, mostly the end. It’s also mostly the last chunk that becomes truncated, i.e. the chunk that was sliced when I manually stopped the recording. I’m trying to figure it out but can’t find any solution.
Is this just a single model run with one context window that can be exhausted? Or are there internal techniques that split the received audio and keep the context small, similar to the 30-second windowed processing that whisper-1 applies to long audio?
Does the intelligent AI get fed up with doing the work and emit a stop sequence instead?
It would seem that consistently failing at about the 80% mark is the kind of fault you’d need observability over the whole job to diagnose.
I have noticed the same pattern. For example, in a 10-minute conversation (ogg format, 3.4 MB file), the content from the 10th minute is missing from the transcript. I can reproduce this with gpt-4o-transcribe and gpt-4o-mini-transcribe. However, whisper-1 does produce a complete transcript.
I have the exact same issue, but with pyaudio recordings split every 60 seconds. It’s not the duration, it’s the context window for me: shifting the chunk boundary even by a couple of seconds makes gpt-4o-transcribe understand; otherwise it interprets only one sentence. Whisper works. Temperature won’t save anyone this time. Competition is coming, so they need to move beyond the transcribe model being just a YouTube learning pipe.
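For anyone chunking the same way, this is the kind of shift/overlap I mean, sketched out (the sample rate, sample width, and filenames are assumptions about my setup):

```python
import wave

RATE = 16_000        # recording sample rate (assumption)
SAMPLE_WIDTH = 2     # 16-bit mono PCM (assumption)
CHUNK_SECONDS = 60
OVERLAP_SECONDS = 2  # shift each boundary back by a couple of seconds

def write_chunks(pcm: bytes, prefix: str = "chunk") -> None:
    """Slice raw PCM into 60 s chunks that overlap by ~2 s."""
    bytes_per_second = RATE * SAMPLE_WIDTH
    step = (CHUNK_SECONDS - OVERLAP_SECONDS) * bytes_per_second
    size = CHUNK_SECONDS * bytes_per_second
    for i, start in enumerate(range(0, len(pcm), step)):
        with wave.open(f"{prefix}_{i:03d}.wav", "wb") as wf:
            wf.setnchannels(1)
            wf.setsampwidth(SAMPLE_WIDTH)
            wf.setframerate(RATE)
            wf.writeframes(pcm[start:start + size])
```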
I’m experiencing the same issue. I use the model for voice input to my computer, and after switching from Whisper to gpt-4o-transcribe, I immediately noticed this problem (with German language input).
The last sentence is frequently missing from my transcriptions, and sometimes two sentences are combined into one. I’ve found that it seems to help if you don’t end the recording immediately after speaking, but instead wait two to three seconds before stopping.
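If you can’t change when the recording stops, padding the file before upload seems to have the same effect (a sketch using pydub, which needs ffmpeg installed; the helper name is mine):

```python
from pydub import AudioSegment

def pad_with_silence(in_path: str, out_path: str, ms: int = 2500) -> None:
    """Append ~2.5 s of silence so the final sentence isn't cut off."""
    audio = AudioSegment.from_file(in_path)
    tail = AudioSegment.silent(duration=ms, frame_rate=audio.frame_rate)
    (audio + tail).export(out_path, format="mp3")
```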
On the positive side, the new model definitely transcribes many technical terms correctly that were problematic before. For example, ChatGPT is no longer transcribed as “JetGPT,” and when speaking about OpenAI models, they’re correctly identified now.
Despite these improvements, the truncation issue is very frustrating. I hope OpenAI addresses this soon or provides API parameters that allow us to control this behavior.
I switched back to whisper. gpt-4o-transcribe leaves out whole sentences all the time, sometimes sentences in the middle too. It’s completely unreliable and just unusable for me.