We’re using the Assistants API, and since yesterday streamed responses have been getting cut off mid-reply. This is causing significant inconvenience for our customers, yet there has been no acknowledgement or incident report from OpenAI.
When we set stream=false, responses are returned correctly. With stream=true, however, replies frequently stop abruptly during generation. This is severely impacting our production service, so we ask for an urgent fix or official guidance.
I am experiencing the same streaming issue described here, both in my own applications and when using the Assistants Playground on the OpenAI platform. In my tests, streamed responses almost always stop mid-reply when a request takes more than roughly 15 seconds, while responses that complete in under 15 seconds stream correctly.
This behavior occurs across multiple models that I have tried, so it does not appear to be model-specific. The problem has only been present for the past 48 hours; before that, streaming worked as expected.