Streaming is now available in the Assistants API!

coderanger · March 19, 2024, 2:02pm

Oh man, this is awesome! I can’t wait to add support for it!

KeyboardKlutz · March 20, 2024, 12:41pm

I cannot for the life of me figure out how to resolve my import issues in the Python library. I just keep getting cannot import name ‘AssistantEventHandler’ from ‘openai’, or errors saying there is no openai.beta, and so on. I am using Python 3.9, if that is relevant. Hopefully this is an OK thread to ask in. All of the examples from the documentation produce errors, and I have made sure the library is updated. Any help would be so, so awesome. Thanks!

jhakulin · March 20, 2024, 1:57pm

Did you upgrade the openai library to version 1.14.0 or later?
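A quick way to check is to ask the interpreter that actually runs the app which version it sees. The sketch below is only illustrative: the helper names are made up for this example, and 1.14.0 is the minimum quoted in this thread. One possible cause (an assumption, not something confirmed here) is that pip upgraded the package in a different environment from the one running the script.

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (1, 14, 0)  # minimum version quoted in this thread


def parse_version(text):
    """Turn '1.14.0' into (1, 14, 0); a suffix such as 'rc1' on a part is ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(text, minimum=MINIMUM):
    """True if the version string is at least the minimum."""
    return parse_version(text) >= minimum


def installed_openai_version():
    """Version of openai visible to this interpreter, or None if not installed."""
    try:
        return version("openai")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_openai_version()
    if found is None:
        print("openai is not installed for this interpreter")
    elif meets_minimum(found):
        print(f"openai {found} should include the streaming helpers")
    else:
        print(f"openai {found} is older than 1.14.0; upgrade it for this interpreter")
```

Running it with the same `python` executable that runs the failing script shows whether the upgrade landed where it was expected.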

marketing5 · March 20, 2024, 2:54pm

Works great! Already running this live, and it’s just as fast as the ChatCompletions route. Now we just need “temperature” and I’m happy. Without temperature settings, I notice that the replies I get are just too creative for my use cases. I can’t prompt it out, so temperature is necessary.

devakumaraswamy25 · March 21, 2024, 12:01pm

@kachari.bikram42 Have you noticed any improvement in the total response time when using streaming? I cannot see using it in any apps (even internally for evaluation) given the current performance.
I see the same performance issue with the Azure OpenAI version of the assistant.
Cheers…

waseem_gul · March 22, 2024, 2:58am

This is how I am handling it.

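import json

def stream_run(client, thread_id, assistant, query):
    # Generator: yields server-sent-event frames ("data: ...\n\n") for one run.
    # requires_action() is a helper defined elsewhere (not shown in this post).
    with client.beta.threads.runs.create_and_stream(
        thread_id=thread_id,
        assistant_id=assistant.id,
    ) as stream:
        for event in stream:
            if event.event == 'thread.run.step.created':
                if event.data.type == 'tool_calls':
                    # The run wants tool output: hand the final run to the helper and stop.
                    print('\nTool calls detected..')
                    final_run = stream.get_final_run()
                    yield from requires_action(final_run, query)
                    break
                else:
                    # A message is being created: forward each text delta as it arrives.
                    print('\nMessage creation detected...')
                    for text in stream.text_deltas:
                        yield f"data: {json.dumps({'text': text})}\n\n"
            elif event.event == 'thread.message.delta':
                yield f"data: {json.dumps({'text': event.data.delta.content[0].text.value})}\n\n"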
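The frames yielded above use the server-sent events convention of a `data: ` line followed by a blank line. As a rough sketch of the receiving side, the snippet below decodes such frames from an iterable of string chunks. The function names are invented for this example, and it assumes every frame carries a JSON object with a `text` key, as in the code above.

```python
import json


def sse_frame(text):
    """Encode one text delta in the same shape as the generator above."""
    return f"data: {json.dumps({'text': text})}\n\n"


def iter_sse_text(chunks):
    """Yield the 'text' field of each complete frame.

    Network chunks can split a frame anywhere, so incomplete data is
    buffered until the blank line that ends the frame arrives.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            for line in frame.split("\n"):
                if line.startswith("data: "):
                    yield json.loads(line[len("data: "):])["text"]


if __name__ == "__main__":
    wire = sse_frame("Hel") + sse_frame("lo")
    # Simulate the bytes arriving in awkwardly sized pieces.
    pieces = [wire[:7], wire[7:20], wire[20:]]
    print("".join(iter_sse_text(pieces)))
```

Because `json.dumps` escapes newlines inside the text, a delta containing a line break still fits on a single `data: ` line.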

nttien1197 · March 22, 2024, 6:16am

Any plans to add support for this in Azure OpenAI? Thanks!

kachari.bikram42 · March 22, 2024, 7:50am

I haven’t observed any significant improvement. With streaming, it’s just that in the UI a user no longer has to wait for the entire response before anything is displayed. It helped me improve the user experience.

willowwisp · March 23, 2024, 1:26pm

I just wrote an example on Medium here: https://medium.com/@hawkflow.ai/openai-streaming-assistants-example-77e53ca18fb4

Seunghun · March 26, 2024, 1:43am

Hi! I’m an undergraduate student in South Korea. Thanks for sharing your code.
I have a few questions about your code.

What do the requires_action function and the query parameter do?
I’ve tried your code, but the tool_calls output is not yielded.
How can I get code interpreter outputs?

devakumaraswamy25 · March 26, 2024, 1:03pm

@kachari.bikram42 Thanks for the feedback. I will give the streaming a try when I get a chance.
Cheers…
Deva

rokbenko · March 28, 2024, 1:47pm

Tutorial on how to implement the response streaming functionality in Python and Node.js

I built a terminal user interface to be able to chat with a customer support chatbot in the past (see the YouTube tutorial). Today, I created a new YouTube tutorial and added the response streaming functionality.

There is an example for both Python and Node.js. See my GitHub repository with full code for the tutorial.


william.zebrowski · March 29, 2024, 1:20pm

Hey @waseem_gul - any ideas how one can point the streaming to a client side chat interface?

nikunj · March 29, 2024, 10:01pm

We just added support for temperature! Hope it works well for your use-case.

danny-avila · March 29, 2024, 11:32pm

Woah nice. I was like, I didn’t see this on the changelog: https://platform.openai.com/docs/changelog

But then I looked at your comment timestamp haha

hq1 · April 1, 2024, 6:50pm

Hi @rokbenko, thanks for sharing this code. Have you managed to get this working with a Next.js (or other) frontend UI? I am currently only able to return the response to my chat interface (Next.js) via WebSockets once the answer has been completed in the backend (Python/FastAPI). I am testing both Next.js and Dash as frontends, and I am possibly missing something really simple to get the real-time stream to my frontend… Thanks!

jschmid · April 3, 2024, 11:06pm

Hey hq1, I ran into your comment when I was trying to find an answer myself a few days ago. The things I had to do before it worked properly for me were:

1. Make everything async.
2. Import AsyncOpenAI and AsyncAssistantEventHandler rather than their synchronous alternatives.
3. Switch from Flask to FastAPI so I could use WebSockets instead of endpoints, and send each on_message_delta through the socket to be displayed.

Hopefully you just missed one of these steps and can fix it real quick. I’m using python for my backend and just pure js in my frontend.
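To illustrate the forwarding step, here is a minimal sketch that uses only the standard library. `DeltaRelay`, `fake_run`, and `demo` are hypothetical names for this example. `fake_run` stands in for the assistant stream; in a real app the pushes would presumably come from an event-handler callback, and `send` would be something like a WebSocket's send method. Both of those are assumptions, not code from this thread.

```python
import asyncio

_DONE = object()  # sentinel marking the end of a run


class DeltaRelay:
    """Buffers text deltas from a producer and forwards them to an async `send`."""

    def __init__(self):
        self._queue = asyncio.Queue()

    async def push(self, text):
        """Called by the producer for every delta."""
        await self._queue.put(text)

    async def close(self):
        """Called by the producer once the run has finished."""
        await self._queue.put(_DONE)

    async def forward(self, send):
        """Send each delta as soon as it arrives, until the run ends."""
        while True:
            item = await self._queue.get()
            if item is _DONE:
                break
            await send(item)


async def fake_run(relay):
    """Stand-in for the assistant stream (assumption: not a real API call)."""
    for piece in ["Stream", "ing ", "works"]:
        await relay.push(piece)
        await asyncio.sleep(0)  # give the forwarding task a chance to run
    await relay.close()


async def demo():
    received = []

    async def send(text):  # stand-in for a WebSocket send call
        received.append(text)

    relay = DeltaRelay()
    await asyncio.gather(fake_run(relay), relay.forward(send))
    return received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the queue is that the producer never waits for the whole answer: each delta is handed to `send` while later ones are still being generated, which is what keeps the frontend updating in real time.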

icdev2dev · April 3, 2024, 11:30pm

Flask with eventlet.sleep(0) works to stream.

AerisAnjali · April 5, 2024, 6:55am

In my Python Flask application, would I have to refactor it to incorporate asynchronous functionality in order to utilize the stream, or can I integrate it directly into my existing setup?

vonv · April 5, 2024, 10:00am

Is there or will there be an endpoint exposed to get the stream?
