The model cannot technically return a "tuple", which is a Python convention. It streams JSON objects.
You don't tell us whether you are using the SDK or how you wrote your own stream parser.
I just ran a transcription. Here's the end of the raw stream.
...
data: {"type":"transcript.text.delta","delta":" Arabs"}
data: {"type":"transcript.text.delta","delta":"Welcome"}
data: {"type":"transcript.text.delta","delta":" to"}
data: {"type":"transcript.text.delta","delta":" our"}
data: {"type":"transcript.text.delta","delta":" radio"}
data: {"type":"transcript.text.delta","delta":" show"}
data: {"type":"transcript.text.delta","delta":"."}
data: {"type":"transcript.text.done","text":"In order to be able to talk we just have to agree that we're talking roughly about the same thing. And I know that you know\nAs much about time as I need you to know\nWe got here on time, and you know what that means.\nWelcome to our radio show.\nAnother subtlety involved was already mentioned.\nWelcome to our radio show.\nWelcome to our radio show.\nWelcome to our radio show.\nSo that doesn't work either, and that's another subtlety that we'll have to get around in quantum mechanics.\nBut as we are going to do, we first learn to see what the problems are before the complications, and then we'll be in a better position to correct it for the more recent knowledge on the subject.\nSo we'll take a simple point of view about time and space, you know what it means in a rough way.\nWelcome to our radio show.\nSection 8.2\nSpeed\nNevertheless, there are still some subtleties.\nWelcome to our radio show.\nWell, they could do this all right.\nWelcome to our radio show.\nWelcome to our radio show.\nThe Greeks got very confused about this, and a new branch of mathematics had to be discovered beyond that.\nGeometry and algebra of the Greeks and Arabs\nWelcome to our radio show.","usage":{"type":"tokens","total_tokens":1479,"input_tokens":1184,"input_token_details":{"text_tokens":132,"audio_tokens":1052},"output_tokens":295}}
It also shows that the gpt-4o-transcribe model continues to malfunction, to the point where it cannot be used in place of the reliable whisper-1. Here it repeats the input "prompt" text about the transcript being a radio show multiple times instead of transcribing the audio chunk. The request used 'chunking_strategy': (None, "auto") (the None is there because this multipart parameter has no filename). The prompt is supposed to indicate to the model the lead-up text before the latest audio, and it should never be repeated in the output.
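If you want to check your own results for this kind of echo, here is a rough sketch that counts repeated lines in the final "text" value. The function name `repeated_lines` and the `min_count` threshold are my own choices for illustration, not part of any API, and it assumes the transcript separates segments with newlines as in the stream above.

```python
from collections import Counter


def repeated_lines(text: str, min_count: int = 2) -> dict:
    """Return each non-empty line that occurs at least min_count times.

    Hypothetical helper: a line that shows up many times, and matches
    what you sent as the prompt, is a strong hint of prompt echo.
    """
    counts = Counter(
        line.strip() for line in text.split("\n") if line.strip()
    )
    return {line: n for line, n in counts.items() if n >= min_count}
```

Pass it the "text" field of the transcript.text.done event and compare any repeated line against your prompt.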
I would suggest similarly logging the raw RESTful requests instead of using the OpenAI SDK, so you can see whether your particular input throws out JSON gone goofy, with unexpected events. Or, if you are indeed running your own code, find out how you are misinterpreting the data that was sent.
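As a starting point for that kind of logging, here is a minimal sketch of a parser for the "data:" lines shown above. It only assumes the two event types visible in my stream (transcript.text.delta and transcript.text.done); the function name `parse_sse_lines` and the returned dict keys are made up for this example. Anything else, including JSON that fails to parse, is collected so you can inspect it.

```python
import json
from typing import Iterable


def parse_sse_lines(lines: Iterable[str]) -> dict:
    """Collect deltas, the final text, and anything unexpected.

    Hypothetical helper: feed it the decoded lines of the HTTP response
    body, one string per line.
    """
    deltas = []
    final_text = None
    unexpected = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            continue  # end sentinel some streams send; harmless if absent
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            unexpected.append(payload)
            continue
        if not isinstance(event, dict):
            unexpected.append(payload)
            continue
        event_type = event.get("type")
        if event_type == "transcript.text.delta":
            deltas.append(event.get("delta", ""))
        elif event_type == "transcript.text.done":
            final_text = event.get("text")
        else:
            unexpected.append(payload)
    return {
        "streamed": "".join(deltas),
        "final": final_text,
        "unexpected": unexpected,
    }
```

If "unexpected" comes back non-empty, or "streamed" differs from "final", that tells you whether the problem is in what the API sent or in how your code reads it.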