Azure OpenAI streaming vs. OpenAI directly

Has anyone noticed a change in the behavior of Azure streaming compared to OpenAI? It seems that Azure has stopped streaming small chunks and is now streaming whole sentences. I am wondering why, because the usability is now worse than before…

cheers
Alex


SSE (server-sent events) can arrive in packet sizes or at time intervals of the provider's choosing, so your SSE client should be built to accept events that vary in size from a single character up to the full payload. Providers also often shape events for load management: under heavy load, or when the system is generating very quickly internally, events may be emitted at a steady pace so that no subsystem is overloaded.
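
For illustration, here is a minimal sketch of a streaming consumer using the openai Python SDK (v1-style client); the model name and prompt are placeholders, not anything from this thread. The point is that the loop makes no assumption about chunk size, so it behaves the same whether the server sends single tokens or whole sentences:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder model and prompt; substitute your own deployment/model.
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short haiku about streaming."}],
    stream=True,
)

for chunk in stream:
    # Each event may carry anything from one character to several sentences;
    # simply append whatever arrives.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```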


Yes, I noticed this problem too. With the OpenAI endpoint, the output appears word by word, which makes for a good user experience. With Azure, users often have to wait for a while and then a few paragraphs suddenly pop up, which feels much worse.

I wonder if there is a way to compensate for this or otherwise improve it?
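
One client-side workaround, sketched below (this is just an idea, not anything Azure provides), is to buffer the incoming deltas and re-emit them at a steady character rate, so that large chunks still render as a smooth typewriter effect. The trade-off is that the very end of the reply appears slightly later than it otherwise would:

```python
import sys
import time

def smooth_print(deltas, chars_per_second=60):
    """Re-emit streamed text at a steady pace, regardless of chunk size.

    `deltas` is any iterable of text fragments, e.g. the content pieces
    pulled out of a chat-completions stream.
    """
    delay = 1.0 / chars_per_second
    for delta in deltas:
        for ch in delta:
            sys.stdout.write(ch)
            sys.stdout.flush()
            time.sleep(delay)

# Example usage with simulated uneven chunks:
smooth_print(["Azure sometimes ", "sends several sentences in one event, ",
              "but this still prints character by character."])
```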

Might be something worth raising a ticket with Azure support about