There aren't many examples out there, but I'm curious whether anyone has had any luck using the Assistants API (beta) asynchronously to push a stream to a front end.
My application is in Python and uses FastAPI as the backend server.
Responses currently take a while to arrive in full, and my hope is that with streaming the user will at least start seeing the response much sooner.
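The core plumbing is the same regardless of which SDK produces the tokens: an async generator that consumes text deltas and yields Server-Sent-Events frames, which FastAPI's `StreamingResponse` can return directly (`StreamingResponse(sse_stream(deltas), media_type="text/event-stream")`). Here is a minimal, self-contained sketch; `fake_token_source` is a stand-in for a real delta stream such as `stream.text_deltas` from the OpenAI async client, and the SSE framing shown is my assumption about what the front end expects:

```python
import asyncio

def sse_format(text: str) -> str:
    # Server-Sent Events framing: a "data:" line followed by a blank line.
    return f"data: {text}\n\n"

async def fake_token_source():
    # Stand-in for the Assistants API text-delta stream
    # (e.g. `stream.text_deltas` from the async OpenAI SDK client).
    for token in ["Hel", "lo ", "wor", "ld"]:
        await asyncio.sleep(0)  # simulate per-chunk network latency
        yield token

async def sse_stream(deltas):
    # Bridge any async iterator of text deltas into SSE frames;
    # a FastAPI StreamingResponse can consume exactly this generator.
    async for delta in deltas:
        yield sse_format(delta)
    yield "data: [DONE]\n\n"  # sentinel so the client knows the run ended

async def main():
    # Collect the frames as a plain list (a browser would read them live).
    return [frame async for frame in sse_stream(fake_token_source())]

if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a real endpoint you would replace `fake_token_source()` with the SDK's delta stream and let the client start rendering as soon as the first frame lands.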
I have NOT used the async Assistants API, primarily because I am using Flask, and while I can create a thread on which to run asyncio, that thread does not share context with the main Flask app (and I'm too lazy to figure out how).
That said, in the context of Flask, I have had to call eventlet.sleep(0) inside the `for event in stream` loop... YMMV.
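The loop described above can be sketched generically. Here `cooperative_yield` is a hypothetical stand-in for `eventlet.sleep(0)`, which hands control back to eventlet's hub between chunks so other green threads get a turn and the response actually flushes:

```python
def relay(stream, cooperative_yield=lambda: None):
    # Forward each event from the upstream stream to the response generator.
    # In an eventlet-based Flask deployment, pass cooperative_yield=eventlet.sleep(0)
    # wrapped in a lambda, i.e. lambda: eventlet.sleep(0), so the green-thread
    # scheduler runs between chunks instead of the loop hogging the hub.
    for event in stream:
        yield event
        cooperative_yield()  # give the hub a chance to flush this chunk

# Plain-Python demonstration with a fake three-chunk stream:
chunks = list(relay(iter(["a", "b", "c"])))
```

This is only a sketch of the control-flow pattern; the actual fix in the post is the single `eventlet.sleep(0)` call inside the streaming loop.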
For FastAPI, I have implemented streaming with chat completions successfully.
I think the Assistants API should work as well, but one thing to check: do you have gzip middleware (or anything similar) enabled?
Simply removing the gzip middleware made my streaming work. FYI.
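For reference, the middleware in question is typically registered like this in FastAPI (a config sketch; the `minimum_size` value is illustrative). `GZipMiddleware` buffers the response body so it can compress it, which defeats chunked streaming, so the client sees nothing until the run completes:

```python
from fastapi import FastAPI
from fastapi.middleware.gzip import GZipMiddleware

app = FastAPI()

# This is the line to remove (shown commented out): gzip buffers and
# compresses the whole body, so streamed chunks never flush incrementally.
# app.add_middleware(GZipMiddleware, minimum_size=1000)
```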
I’ve spent a few days working through this and haven’t fully cracked it.
There aren't really any examples I can find so far; I'll come back and post if I find a good one.