Persistent Truncation Issues with GPT-4o-Transcribe – Has Anyone Fully Solved This?

Hi everyone,

I’ve spent a lot of time trying to build a reliable speech-to-text pipeline using OpenAI’s transcription models, both through the /v1/audio/transcriptions endpoint and through the new real-time /v1/realtime WebSocket API (using the gpt-4o-transcribe model). I’ve tested this in a custom browser-based web app with a direct WebSocket connection, trying a range of variations, including different chunk sizes, VAD settings, and silence durations.

Despite all this, I still consistently run into the same issue: a high frequency of truncated transcripts.

To clarify:

  • The transcriptions I do get are high-quality and accurate.
  • But large parts of the audio are simply missing from the final transcript.
  • This occurs both for short clips (2–3 minutes) and for longer conversations.
  • I use this in my work to transcribe real-time conversations between two people, so completeness is essential.
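
One rough way to catch the "large parts missing" case automatically is to compare the transcript's word count against the audio duration. The sketch below is only a heuristic: the function names and the 60 words-per-minute threshold are assumptions for illustration, not values from OpenAI's documentation, and a quiet recording could be flagged even when nothing was dropped.

```python
def words_per_minute(transcript: str, audio_seconds: float) -> float:
    """Return the transcript's word rate relative to the audio length."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return len(transcript.split()) / (audio_seconds / 60.0)


def looks_truncated(transcript: str, audio_seconds: float,
                    min_wpm: float = 60.0) -> bool:
    """Flag transcripts whose word rate is implausibly low for conversation.

    min_wpm is an assumed threshold; tune it on recordings whose
    transcripts have been verified by hand.
    """
    return words_per_minute(transcript, audio_seconds) < min_wpm


if __name__ == "__main__":
    # 40 words for a 3-minute clip is roughly 13 words per minute.
    short_transcript = " ".join(["word"] * 40)
    print(looks_truncated(short_transcript, audio_seconds=180))
```

A check like this does not fix truncation, but it makes the failure visible so that a clip can be retried or reviewed instead of being silently accepted.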

I’ve searched extensively online, including this forum, Reddit, GitHub, and developer blogs, but I haven’t found anyone who explicitly claims to have solved this issue 100%, meaning no truncation, ever, under realistic usage conditions.

So my question is:
Has anyone here successfully built a system using gpt-4o-transcribe (especially over WebSocket in real-time) that consistently avoids truncation and always returns complete transcripts?

If so, I would deeply appreciate:

  • A link to working code or an open-source repo
  • Any insight into what might be causing the truncation

Thanks in advance to anyone who can help point me in the right direction. This has become a major blocker for real-world use, and it would be great to hear from someone who has managed to overcome it.

Unfortunately, this is a known issue, and so far there is no solution that I’ve heard of.

When I have sensitive content, I still use the whisper-1 model, which is slightly inferior but more resilient to truncation.
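
For file-based transcription, that preference can be turned into an automatic fallback: try gpt-4o-transcribe first and retry with whisper-1 only when the result looks incomplete. This is a minimal sketch, not a tested fix. The names transcribe_with_fallback, openai_transcribe, and is_complete are hypothetical, and the completeness check is something you have to supply yourself. The only real API call assumed here is client.audio.transcriptions.create from the official openai Python package.

```python
from typing import Callable, Sequence, Tuple

# (model_name, audio_path) -> transcript text
Transcriber = Callable[[str, str], str]


def transcribe_with_fallback(
    audio_path: str,
    transcribe: Transcriber,
    is_complete: Callable[[str], bool],
    models: Sequence[str] = ("gpt-4o-transcribe", "whisper-1"),
) -> Tuple[str, str]:
    """Try each model in order and return (model, text).

    The first transcript that passes is_complete wins; if none pass,
    the longest transcript seen is returned instead.
    """
    if not models:
        raise ValueError("models must not be empty")
    best_model, best_text = models[0], ""
    for model in models:
        text = transcribe(model, audio_path)
        if is_complete(text):
            return model, text
        if len(text) > len(best_text):
            best_model, best_text = model, text
    return best_model, best_text


def openai_transcribe(model: str, audio_path: str) -> str:
    """Call /v1/audio/transcriptions through the openai package."""
    from openai import OpenAI  # lazy import: the helper above has no dependencies

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

Usage would look something like transcribe_with_fallback("call.wav", openai_transcribe, is_complete=lambda t: len(t.split()) >= 200), where the file name and the 200-word minimum are placeholders. This only applies to the file endpoint; it does nothing for truncation on the real-time WebSocket path.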

I find it a bit weird. I haven’t seen any official statement from OpenAI about gpt-4o-transcribe returning incomplete transcripts. So far I have seen absolutely no one online who has managed to get complete, non-truncated transcripts from gpt-4o-transcribe. The model came out a good few months ago, and I still haven’t seen a solution.