Wondering if anyone else is experiencing this; it happens quite often. Everything streams along fine through reasoning_summary, but as it reaches the last two or three tokens it stalls, sometimes for over 10 seconds, before quickly doling out the last tokens of that part and moving on.
In the example above, between lines 3 and 4 you’ll see a 14-second delay…
I don’t see the same behavior in ChatGPT, and I don’t think it’s my implementation: I use the same logic to stream the regular text chunks and never run into this kind of problem, only with the tail end of reasoning summaries.
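In case anyone wants to reproduce it, here’s a minimal sketch of the kind of timing check I mean, assuming the Python SDK’s Responses streaming events; the model name, prompt, and 2-second threshold are all placeholders:

```python
import time
from openai import OpenAI

client = OpenAI()

# Stream a response with reasoning summaries enabled; model and
# reasoning settings are placeholders, not necessarily what I run.
stream = client.responses.create(
    model="o4-mini",
    reasoning={"summary": "auto"},
    input="Explain why the sky is blue.",
    stream=True,
)

last = time.monotonic()
for event in stream:
    now = time.monotonic()
    gap = now - last
    last = now
    # Flag any unusually long pause between consecutive stream events,
    # tagged with the event type that arrived after the pause.
    if gap > 2.0:
        print(f"\n[stall] {gap:.1f}s before {event.type}")
    if event.type == "response.reasoning_summary_text.delta":
        print(event.delta, end="", flush=True)
    elif event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
```

With this, the stall consistently shows up right before the final reasoning-summary deltas, never in the output-text deltas.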
Anyone have any thoughts?