There are not many examples out there, but I'm curious whether anyone has had any luck using the Assistants API (beta) in an async manner to push the stream to a front end.
My application is in Python, using FastAPI as the backend server.
Responses are taking a while to arrive in full, and my hope is that with streaming the user will at least start seeing the response much sooner.
I have NOT used the async Assistants API, primarily because I am using Flask, and while I can create a thread on which to run asyncio, that thread does not share context with the main Flask app (and I'm too lazy to figure out how).
That said, in the context of Flask, I have had to call eventlet.sleep(0) inside the "for event in stream" loop… YMMV.
For FastAPI, I have implemented streaming with chat completions successfully.
I think the Assistants API should work as well, but one thing to check: do you have gzip middleware (or anything similar) enabled?
Just removing the gzip middleware made my streaming work. FYI.
@brandonareid2
Got it to work with my Telegram message handler. A Telegram user is able to ask questions about the files attached to the assistant. I tried to incorporate functions; however, I could not figure out how to get them to work. I got as far as adding on_event. When I used the API to send the output via the Assistants API and examined the stream response, I could see the message created by ChatGPT but could not stream it out to Telegram. Will you be working to enhance your solution to support functions?
Currently, I am calling the submit_tool_outputs API with stream=True via requests.post, parsing the string to find "thread.message.completed" for the associated data, and issuing self.queue.put_nowait(data["content"][0]["text"]["value"]).
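That manual parsing step can be sketched roughly as follows; `extract_completed_texts` is a hypothetical helper name, and the payload shape assumes the standard Assistants SSE event framing:

```python
import json

def extract_completed_texts(sse_body: str):
    """Scan a raw SSE response body for thread.message.completed events
    and pull out the message text, as in the manual approach above."""
    texts = []
    event = None
    for line in sse_body.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:") and event == "thread.message.completed":
            data = json.loads(line[len("data:"):].strip())
            texts.append(data["content"][0]["text"]["value"])
    return texts

# Example body shaped like an Assistants streaming response:
sample = (
    "event: thread.message.completed\n"
    'data: {"content": [{"text": {"value": "Hello"}}]}\n'
    "\n"
)
print(extract_completed_texts(sample))  # ['Hello']
```

Note this only surfaces whole completed messages; to stream token by token you would instead watch for the delta events.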
Some Medium articles cover the topic, but I found them too complex for what I need; you can search Medium to check whether the solutions there better fit your use case.
After looking around a bit, though, I found a simpler way to achieve streaming using SSE with FastAPI, with the following example snippet:
```python
from openai import AsyncOpenAI

async def stream_assistant_response(thread_id, assistant_id):
    client = AsyncOpenAI()
    # runs.stream(...) returns an async context manager over the event stream
    async with client.beta.threads.runs.stream(
        thread_id=thread_id,
        assistant_id=assistant_id,
    ) as stream:
        async for text in stream.text_deltas:
            # frame each delta as a Server-Sent Event
            yield f"data: {text}\n\n"
```
Make sure you also set the media_type correctly in the API response logic:
This code works fine for me when I test the endpoint in Postman; I am able to see every event with its corresponding text delta.
Not sure if there are underlying issues with the logic that I can't catch at the moment.
You can find the references that were helpful for me in the helpers.md file in OpenAI's Python API library repo, and in issue #1473 on that repo as well.
Hope that helps!