How do you stream Assistants API responses?

Assistants with function calling can take a long time to respond (sometimes up to 2 minutes), which makes for an awkward UX. We urgently need a streaming feature for this!


Models used with streaming are also showing token-counting errors: reported consumption is 78% higher than the actual usage.

Have you found a solution? If so, please explain. Also, would using GPT-4 give us faster responses?

The example they give uses the Completions API; I'm not sure whether it works with the Assistants API.
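For reference, this is the shape of the streaming example for Chat Completions. With `stream=True`, the SDK returns an iterator of chunks whose text lives in `choices[0].delta.content`. A minimal sketch of the consumer pattern, using a stub generator in place of the real `client.chat.completions.create(..., stream=True)` call so it runs without an API key; the chunk shape mirrors the SDK's streaming objects:

```python
from types import SimpleNamespace

def fake_stream():
    # Stand-in for client.chat.completions.create(..., stream=True):
    # yields objects shaped like streaming chunks (assumption for illustration).
    for piece in ["Hel", "lo", " world"]:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def collect(stream):
    # Accumulate the partial tokens as they arrive; in a UI you would
    # render each piece immediately instead of joining at the end.
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk may carry no content
            parts.append(delta)
    return "".join(parts)

print(collect(fake_stream()))  # prints "Hello world"
```

Swapping `fake_stream()` for the real call gives incremental output with Chat Completions; the open question is whether the Assistants API exposes anything equivalent.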

Currently, streaming isn't supported for the Assistants API, as mentioned in the documentation.
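Until streaming lands, the usual workaround is to poll the run's status and fetch the messages once it completes. A minimal sketch, assuming an `openai.OpenAI()` client and a thread/run you have already created (the helper name and parameters here are illustrative, not from the docs):

```python
import time

def wait_for_run(client, thread_id, run_id, interval=1.0, timeout=120.0):
    """Poll an Assistants API run until it leaves a non-terminal state.

    `client` is assumed to be an openai.OpenAI() instance; `thread_id` and
    `run_id` come from your own thread/run creation calls.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status not in ("queued", "in_progress"):
            return run  # completed, failed, requires_action, ...
        time.sleep(interval)
    raise TimeoutError("run did not finish in time")
```

This doesn't give token-by-token output, but it at least bounds the wait and lets you show a progress state to the user instead of blocking for the full two minutes.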
