Streaming is now available in the Assistants API!

In case you missed it, OpenAI staff dropped by today with a pretty cool announcement…

Check out the Assistants API streaming docs

13 Likes

This came out two hours ago; evaluating now…

3 Likes

This looks promising; evaluating now…

2 Likes

Any thoughts on how to use textCreated and textDelta from the official example in my client-side bubble.tsx?

const run = openai.beta.threads.runs.createAndStream(thread.id, {
    assistant_id: assistant.id
  })
    .on('textCreated', (text) => process.stdout.write('\nassistant > '))
    .on('textDelta', (textDelta, snapshot) => process.stdout.write(textDelta.value))

I’m unable to handle it in my Next.js web app. The error is:

⨯ unhandledRejection: OpenAIError: Cannot read properties of undefined (reading 'write')
    at eval (webpack-internal:///(rsc)/./node_modules/openai/lib/AbstractAssistantStreamRunner.mjs:46:37)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: TypeError: Cannot read properties of undefined (reading 'write')
      at eval (webpack-internal:///(rsc)/./app/api/chat/route.ts:36:53)
      at eval (webpack-internal:///(rsc)/./node_modules/openai/lib/AbstractAssistantStreamRunner.mjs:169:47)
      at Array.forEach (<anonymous>)
      at AssistantStream._emit (webpack-internal:///(rsc)/./node_modules/openai/lib/AbstractAssistantStreamRunner.mjs:169:23)
      at AssistantStream._AssistantStream_handleMessage (webpack-internal:///(rsc)/./node_modules/openai/lib/AssistantStream.mjs:333:18)
      at AssistantStream._AssistantStream_addEvent (webpack-internal:///(rsc)/./node_modules/openai/lib/AssistantStream.mjs:314:107)
      at AssistantStream._createAssistantStream (webpack-internal:///(rsc)/./node_modules/openai/lib/AssistantStream.mjs:239:102)
      at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      at async AssistantStream._runAssistantStream (webpack-internal:///(rsc)/./node_modules/openai/lib/AbstractAssistantStreamRunner.mjs:202:16)
}

The existing code I am trying to refactor is:

    import OpenAI from "openai";
    import { OpenAIStream, StreamingTextResponse } from "ai";
    
    ....

 
        let run = await openai.beta.threads.runs.create(thread.id, {
          assistant_id: assistant.id,
        });
    
        while (run.status === "in_progress" || run.status === "queued") {
          await new Promise((resolve) => setTimeout(resolve, 500));
          run = await openai.beta.threads.runs.retrieve(thread.id, run.id);
        }
    
        const messages1 = await openai.beta.threads.messages.list(thread.id);
    
        console.log(messages1.data);
    
        let responseContent = "";
        if (messages1.data[0].content[0]?.type === "text") {
          responseContent = messages1.data[0].content[0]?.text.value;
        }
    
        console.log(responseContent);
    
        return new NextResponse(
          JSON.stringify({
            content: responseContent,
            data: {
              threadId: thread.id,
              newThread: newThread,
            },
          })
        );
      } catch (e) {
        throw e;
      }
    }

Here is the useEffect in bubble.tsx where I think I should handle it:

useEffect(() => {
    if (content.role === "assistant") {
      if (content.processing) {
        // Reset displayed content when processing starts
        setDisplayedContent("");
        return;
      }

      if (content?.content && !isLoading) {
        // Split the message into characters and display one by one
        let index = 0;
        const interval = setInterval(() => {
          setDisplayedContent((prev) => prev + content.content.charAt(index));
          index++;
          if (index === content.content.length) {
            clearInterval(interval);
          }
        }, 5);

        return () => clearInterval(interval);
      }
    } else {
      setDisplayedContent(content.content);
    }
  }, [content, isLoading]);
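For reference, the stack trace points at process.stdout being undefined where the handlers run (the bundled RSC context has no usable Node stdout). A minimal sketch of routing the text events to a caller-supplied sink instead, such as a ReadableStream controller returned from the route handler — `wireTextEvents` and `sink` are illustrative names, not SDK API:

```javascript
// Redirect the official example's stdout writes to a caller-supplied sink.
// `stream` is anything with an EventEmitter-style .on(), e.g. the object
// returned by openai.beta.threads.runs.createAndStream(...).
function wireTextEvents(stream, sink) {
  stream
    .on("textCreated", () => sink("\nassistant > "))
    .on("textDelta", (textDelta) => sink(textDelta.value));
  return stream;
}

// Sketch of use inside a Next.js route handler (untested, names illustrative):
// const body = new ReadableStream({
//   start(controller) {
//     const enc = new TextEncoder();
//     wireTextEvents(run, (s) => controller.enqueue(enc.encode(s)));
//     run.on("end", () => controller.close());
//   },
// });
// return new Response(body);
```

The idea is that the route handler owns the transport (stdout, ReadableStream, socket, whatever) and the event wiring stays transport-agnostic.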
1 Like

Great. I’d like to handle thread.run.requires_action in streaming mode. .on('toolCallDone', ...) could work, but it seems too ambiguous.

1 Like

My app is in Python using Flask. To use the stream, will I need to refactor it to be asynchronous, or can I use it in my application as is?

1 Like

Both the Python openai library and curl worked for creating a thread run with the new streaming feature…
But it’s quite hard to adapt previous chat-completion streaming code to the new interface.

For safety, we don’t stream the API to the front end directly: an API gateway streams from the OpenAI API and bridges the stream to the front-end framework, to prevent leaking the API key. If the Python lib could still use a “for chunk in stream_resp”-style implementation, it would be a little easier.
Alternatively, OpenAI could provide another streaming mechanism, such as a one-time token for the streaming session, so the front end could stream directly from OpenAI without the risk of passing the API key, saving some network bandwidth, with a callback or a way for the backend endpoint to retrieve the completed run’s status.

But the chat completion endpoint still seems more flexible and cost-efficient for now; I’ll wait for this to be more production-ready.

1 Like

It’s out, finally. I posted a video on my YouTube about it. There are still some challenges.
@CustomGPT.AIAcademy

2 Likes

:+1: LGTM. All tests pass. Currently rolling out in Prod. All canaries currently in the green.

It’s Miller Time™.*

  • More accurately, it’s time for me to head over to the Side Hustle Taproom in Kirkland, for a celebratory Bellevue Brewing Company Tangerine Pale Ale before I dive back into the stuff I was working on yesterday when this update rolled out. If by some serendipitous chance of fate anyone local actually sees this and runs into me out there tonight, say hello and I’ll buy up to the first five of you a beverage of your choice. Look for the guy in the black Carhartt hoodie with the MacBook Pro with a few road-trip themed stickers on the lid.
3 Likes

Using the Python SDK, streaming text works OK, and tool-call (function) events fire and the functions can be called. However, I do not see how to get an answer from the assistant after the tool outputs are submitted. I would expect streaming to continue after the tool outputs are submitted, with the text being streamed.

3 Likes

This API looks like JS from the 2000s

OpenAI Assistants API will not pass the test of time

1 Like

I have written a blog on this - OpenAI Assistant’s Streaming Support | by Bikram kachari | Mar, 2024 | Medium

2 Likes

The OpenAI examples basically did not cover how to submit tool outputs and stream the answer after the function call. Eventually I used the same AssistantEventHandler for both “create_and_stream” and “submit_tool_outputs_stream”, and that works.

4 Likes

Agreed. I struggle to understand how to handle certain events in Node.js. In this part of the docs, they say we need to handle textCreated/toolCallCreated etc. At the same time, the complete list of events here doesn’t correspond to the (Node.js) code by naming convention. I can’t work out how to translate thread.message.created into textCreated myself, or how to find the appropriate Node.js events that I need to handle.
It is also impossible to tell from the docs how to handle function calls from the assistant when using streaming. I logged all the events to understand what is going on under the hood, but I still don’t get which event I should handle to get the function name with arguments. Is it thread.run.requires_action? This is the only event that contains the full info for running my function, but the event name sounds weird for this purpose.
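For what it’s worth, a minimal sketch of pulling the function name and arguments out of a raw thread.run.requires_action event — the event shape follows the Assistants API reference, but `extractToolCalls` is an illustrative helper, not an SDK function:

```javascript
// Given a raw streaming event, return the pending function calls, or null if
// the event is not a requires_action / submit_tool_outputs event.
function extractToolCalls(event) {
  if (event.event !== "thread.run.requires_action") return null;
  const action = event.data.required_action;
  if (action.type !== "submit_tool_outputs") return null;
  return action.submit_tool_outputs.tool_calls.map((call) => ({
    id: call.id,                              // needed for tool_call_id on submit
    name: call.function.name,
    args: JSON.parse(call.function.arguments), // arguments arrive as a JSON string
  }));
}
```

If this is the right event, event.data is the run object, so event.data.id and event.data.thread_id should be what a subsequent submitToolOutputs call needs — worth verifying against the SDK version in use.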

4 Likes

Just my opinion, but I think about half of that (the event naming convention and missing details) is documentation that OpenAI could do a better job with, and the other half (the part @louis030195 said looks like 2000’s JavaScript) is Node’s streaming and event listener architecture.

I anticipate that OpenAI will fill in the much-needed examples and additional documentation, shortly. Alternately, as @kachari.bikram42 did (thank you), we’ll see folks in the community “fill in the blanks”. On the other part, I suspect the streaming and event architecture will be a harder problem for a while.

In theory, OpenAI could come up with a much easier to use architecture than Node/JS’s, but my own recent experience is that’s easier said than done. On Epic Road Trip Planner, we use a back-end layer and our own API abstractions in an attempt to ‘simplify’ the interface for our front-end code (as well as to more importantly secure our API key), but it turns out that even with our own custom API abstraction, the resulting front-end JavaScript code is definitely still not as clean as we’d like and, indeed, looks quite 2000s-ish as well. (Don’t worry, I’m not throwing anyone else on the team here under the bus - I personally wrote it, so I get to complain about it as much as I want - hah!).

1 Like

I’m struggling with how to submit the tool outputs, and when (on which listener). Could you elaborate a little? I don’t quite understand the particulars of your comment.

Submitting the tool outputs is done from the “on_tool_call_done” event handler, when required_action.type is “submit_tool_outputs”.

Python implementation for that I have is here: azureai-assistant-tool/sdk/azure-ai-assistant/azure/ai/assistant/management/assistant_client.py at main · Azure-Samples/azureai-assistant-tool · GitHub

Now I am struggling with how to implement function calling in a streaming assistant run in my NestJS project.
As you know, the example code looks like this:

const run = openai.beta.threads.runs.createAndStream(thread.id, {
    assistant_id: assistant.id
  })
    .on('textCreated', (text) => process.stdout.write('\nassistant > '))
    .on('textDelta', (textDelta, snapshot) => process.stdout.write(textDelta.value))
    .on('toolCallCreated', (toolCall) => process.stdout.write(`\nassistant > ${toolCall.type}\n\n`))
    .on('toolCallDelta', (toolCallDelta, snapshot) => {
      if (toolCallDelta.type === 'code_interpreter') {
        if (toolCallDelta.code_interpreter.input) {
          process.stdout.write(toolCallDelta.code_interpreter.input);
        }
        if (toolCallDelta.code_interpreter.outputs) {
          process.stdout.write("\noutput >\n");
          toolCallDelta.code_interpreter.outputs.forEach(output => {
            if (output.type === "logs") {
              process.stdout.write(`\n${output.logs}\n`);
            }
          });
        }
      }
    });

I would like to know more about the event handlers in the Node.js SDK.
Does anyone know about this, especially function calling with a streaming assistant?

1 Like

I’m in the same boat (Node.js). I had non-streaming function calls working and was able to get streaming working for regular text responses, but now I’m stuck on function calling. I’m accumulating the argument chunks and adding them to the object returned on the toolCallDone listener, which looks like the below. I assume it’s producing that object so we can use it.

{
  index: 0,
  id: 'call_I5q5dyCFtmkakKXkJACD8cBX',
  type: 'function',
  function: {
    name: 'my_function',
    arguments: '{ my_accumulated_args }',
    output: null
  }
}

Then I’m trying to run the function calls, which should work, but I am a bit confused about how to run .submitToolOutputs, since before, openai.beta.threads.runs.retrieve(threadId, runId) returned the threadId and runId used, but openai.beta.threads.runs.createAndStream does not seem to. I’m also unsure how to capture requires_action like before. This change in logic is mainly what’s tripping me up. I know I’m missing something here, but it’s hard to really tell from the sparse examples. I think one complete example laying out the logic would be helpful.
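A minimal sketch of that accumulation step as a pure function, assuming argument fragments arrive on toolCallDelta keyed by the delta’s index — `accumulateToolCallDeltas` is an illustrative name, not an SDK API:

```javascript
// Merge a sequence of streamed tool-call deltas into complete call objects,
// keyed by index; argument fragments are concatenated in arrival order.
function accumulateToolCallDeltas(deltas) {
  const byIndex = new Map();
  for (const delta of deltas) {
    let call = byIndex.get(delta.index);
    if (!call) {
      call = { index: delta.index, id: undefined, type: "function", function: { name: "", arguments: "" } };
      byIndex.set(delta.index, call);
    }
    if (delta.id) call.id = delta.id;                 // id usually arrives on the first delta only
    if (delta.function?.name) call.function.name = delta.function.name;
    if (delta.function?.arguments) call.function.arguments += delta.function.arguments;
  }
  return [...byIndex.values()];
}
```

As for the run id: in recent SDK versions the stream should also expose the completed run (e.g. via finalRun()), and the raw thread.run.requires_action event carries the run object with its id and thread_id — worth verifying against the SDK version you’re on.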