Has anyone managed to get a tool_call working when stream=True?

Hi there! I’m new here, so please forgive any poor choices. I have been playing around with the OpenAI API for a few months now, and this is how I previously handled function calls and streaming in Python:

if chunk["choices"][0]["delta"].get("function_call"):                            
    if "name" in chunk["choices"][0]["delta"]["function_call"]:
        function_name = chunk["choices"][0]["delta"]["function_call"]["name"]
        chunk = chunk["choices"][0]["delta"]
        function_arguments_chunk = chunk["function_call"]["arguments"]
        function_arguments += function_arguments_chunk
        print(function_arguments_chunk, end='', flush=True)
        function_called = True

However, since function calls are now deprecated in favour of tool calls, I was wondering if anyone had a solution for getting something like this working with the new gpt-4-1106-preview model, with streaming and handling of multiple tool calls?

I have deduced that a tool call is now signalled via finish_reason, but I am unsure whether this is still the case while streaming a response.
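
For reference, here’s the kind of check I was hoping would work with the new Python SDK (an untested sketch; the tools definition mirrors the curl examples below):

from openai import OpenAI

client = OpenAI()
tools = [{  # same shape as in the curl examples below
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]
stream = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "What is the weather like in Boston?"}],
    tools=tools,
    stream=True,
)
for chunk in stream:
    # finish_reason stays None until the last chunk; seeing "tool_calls" there
    # would confirm tool calls are still signalled via finish_reason when streaming
    if chunk.choices and chunk.choices[0].finish_reason:
        print(chunk.choices[0].finish_reason)  # expecting "tool_calls"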

I’ll have to do some more digging, but any help is appreciated!

Many thanks
:smiley:

2 Likes

I was just looking into this myself and your post popped up. According to the OpenAI OpenAPI spec, tool call chunks have an index property, which should be present on each returned chunk. This should allow for demarcation of array elements, one per tool call. Haven’t tried it yet, but hope this helps.
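
An untested sketch of what I mean, assuming the chunks match the spec and response is a stream from chat.completions.create(..., stream=True): keep one accumulator per index, since each tool call arrives as its own array element.

calls = {}  # tool-call index -> accumulated arguments string
for chunk in response:
    delta = chunk.choices[0].delta
    for tc in (delta.tool_calls or []):
        calls.setdefault(tc.index, "")
        if tc.function and tc.function.arguments:
            calls[tc.index] += tc.function.arguments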

1 Like

I took a look at this - thanks! However, this still seems to be using the old API, as functions are still mentioned. This is what my chunk.choices looks like (with stream=True):

[Choice(delta=ChoiceDelta(content='', function_call=None, role='assistant', tool_calls=None), finish_reason=None, index=0)]

Even when running with curl (without streaming):

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
  "model": "gpt-4-1106-preview",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
              "type": "object",
              "properties": {
                  "location": {
                      "type": "string",
                      "description": "The city and state, e.g. San Francisco, CA"
                  },
                  "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
              },
              "required": ["location"]
          }
      }
    }
  ],
  "stream": false
}'
OUTPUT:
{"id":"chatcmpl-8KSJXuiGM4kKM8RyWHfZ9HJrnc2uW","object":"chat.completion","created":1699886091,"model":"gpt-4-1106-preview","choices":[{"index":0,"message":{"role":"assistant","content":null,"tool_calls":[{"id":"call_3JPawcsAOmu6Kq8jtpELuFcu","type":"function","functio...

And when running curl with streaming, I get no response:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
  "model": "gpt-4-1106-preview",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
              "type": "object",
              "properties": {
                  "location": {
                      "type": "string",
                      "description": "The city and state, e.g. San Francisco, CA"
                  },
                  "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
              },
              "required": ["location"]
          }
      }
    }
  ],
  "stream": true
}'
gpu@gpu-server:~$ [NO RESPONSE]

So it could be that tool_calls aren’t supported yet with streaming.

Unless someone can try the streaming curl and post their output, so I can see whether it’s an issue on my end, I guess I’ll just wait and see what happens.

1 Like

@Xeniox: I also performed some experiments and agree with your assessment. Posted a bug report here for tracking by OpenAI.

1 Like

Are you still having this problem? I may have a workaround.

1 Like

@Cristhiandcl8: If you have a workaround, please post it. The original poster had no luck with curl, and I had no luck with the new Python SDK, built from the OpenAPI spec with Stainless. With curl, I thought there could be a problem with buffering of the server-sent events (possibly mitigated by --no-buffer), but that does not seem to be the case.

1 Like

I saw that documentation on the Assistants API limitation as part of the release notes, but we’re talking about the standard Chat Completions API here. While I believe it is quite possible that the Assistants API is built on top of Chat Completions, I don’t think we can infer from that that Chat Completions does not support streaming of tool calls.

2 Likes

I tested streaming responses yesterday, as described in the other post. Just follow the raw output to see what to expect; it may help guide you in adjusting your code.
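
Something as simple as this (assuming the openai>=1.x client and a response created with stream=True) shows you the raw deltas:

for chunk in response:
    print(chunk.choices[0].delta)  # see which fields are populated in each chunk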

1 Like

I got tool calls working on streaming calls to Chat Completions last weekend. It was a pain, as it’s not documented, and it’s unclear what’s happening when you dump the chunks as text. Plus, there may be multiple tool/function calls in a single response now. It’s doable, though, and it does work. The relevant snippet is this:

tool_calls = []

# build up the response structs from the streamed response, simultaneously sending message chunks to the browser
for chunk in response:
    delta = chunk.choices[0].delta
    #app.logger.info(f"chunk: {delta}")

    if delta and delta.content:
        # content chunk -- send to browser and record for later saving
        socket.send(json.dumps({'type': 'message response', 'text': delta.content }))
        newsessionrecord["content"] += delta.content

    elif delta and delta.tool_calls:
        tcchunklist = delta.tool_calls
        for tcchunk in tcchunklist:
            if len(tool_calls) <= tcchunk.index:
                tool_calls.append({"id": "", "type": "function", "function": { "name": "", "arguments": "" } })
            tc = tool_calls[tcchunk.index]

            if tcchunk.id:
                tc["id"] += tcchunk.id
            if tcchunk.function.name:
                tc["function"]["name"] += tcchunk.function.name
            if tcchunk.function.arguments:
                tc["function"]["arguments"] += tcchunk.function.arguments
5 Likes

Discovered what was happening in my case (OpenAI Python SDK 1.3). With the previous streaming implementation, for function calls or content, you could always determine the type of response from the first chunk. With tool calls, the first chunk can actually have content, function_call, and tool_calls all set to None, so you have to sniff multiple chunks from the response before you can determine what kind of response you are accumulating.

There is an additional caveat: a tool call chunk delta always presents an array of length 1, containing an object that carries the index inside it. This was non-obvious from looking at the OpenAPI spec. Here’s an example of such a delta:

delta: ChoiceDelta(
    content=None,
    function_call=None,
    role=None,
    tool_calls=[
        ChoiceDeltaToolCall(
            index=0,
            id='call_uGViZDuQa8pAApH3NnMC9TX9',
            function=ChoiceDeltaToolCallFunction(arguments='', name='read'),
            type='function'
        )
    ]
)

Here’s my implementation with the new Python SDK (handling the legacy function calls really should be separate logic, but…):

    from collections import defaultdict
    tool_calls = [ ]
    index = 0
    start = True
    for chunk in response:
        delta = chunk.choices[ 0 ].delta
        if not delta: break
        if not delta.function_call and not delta.tool_calls:
            if start: continue
            else: break
        start = False
        if delta.function_call:
            if index == len( tool_calls ):
                tool_calls.append( defaultdict( str ) )
            if delta.function_call.name:
                tool_calls[ index ][ 'name' ] = delta.function_call.name
            if delta.function_call.arguments:
                tool_calls[ index ][ 'arguments' ] += (
                    delta.function_call.arguments )
        elif delta.tool_calls:
            tool_call = delta.tool_calls[ 0 ]
            index = tool_call.index
            if index == len( tool_calls ):
                tool_calls.append( defaultdict( str ) )
            if tool_call.id:
                tool_calls[ index ][ 'id' ] = tool_call.id
            if tool_call.function:
                if tool_call.function.name:
                    tool_calls[ index ][ 'name' ] = tool_call.function.name
                if tool_call.function.arguments:
                    tool_calls[ index ][ 'arguments' ] += (
                        tool_call.function.arguments )

Hope this helps.

2 Likes

This is how I did it:

recovered_pieces = {
    "content": None,
    "role": "assistant",
    "tool_calls": {}  # keyed by tool call index
}

for chunk in response:
    delta = chunk.choices[0].delta
    if delta.content is None:
        if delta.tool_calls:
            piece = delta.tool_calls[0]
            recovered_pieces["tool_calls"][piece.index] = recovered_pieces["tool_calls"].get(
                piece.index,
                {"id": None, "function": {"arguments": "", "name": ""}, "type": "function"})
            if piece.id:
                recovered_pieces["tool_calls"][piece.index]["id"] = piece.id
            if piece.function.name:
                recovered_pieces["tool_calls"][piece.index]["function"]["name"] = piece.function.name
            if piece.function.arguments:  # guard: can be empty/None in the first chunk
                recovered_pieces["tool_calls"][piece.index]["function"]["arguments"] += piece.function.arguments
    else:
        yield delta.content
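
If you then want to append the assistant message back to the conversation, the API expects tool_calls as a list, so something like this (untested sketch; messages is assumed from the surrounding code) converts the index-keyed dict back:

messages.append({
    "role": "assistant",
    "content": recovered_pieces["content"],
    "tool_calls": [recovered_pieces["tool_calls"][i]
                   for i in sorted(recovered_pieces["tool_calls"])],
})
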
2 Likes