Auto tool call streaming differentiation is unintuitive

I am doing recursive tool calling with tool_choice: auto set. By recursive tool calling I mean that the input of one tool is the output of the previous tool, and the tool call loop continues until content is returned. I need the final content response to be streamed, and since I don’t know when I will get a content response, the intermediary tool calls must be handled from a stream as well.

Working on this, I realized a couple of points in the stream API that could use enhancement:

  1. This use case might be a bit of an edge case, but it would be great if I could somehow get the tool calls as a full (non-streamed) response, and then the text content response as a stream.
  2. While handling the SSE data, I need to detect up front whether the response is going to be a tool call stream or a content stream. I currently do this by checking the content field in the first event’s delta: if content is null, it is a tool call stream; if content is an empty string, it is the final content response stream. There is also this weird (and I think totally unnecessary) behavior where, if the total tool call count is 1, the first tool call is embedded within the initial event data, but if it is more than 1, it comes in the next event. Very weird behavior IMO. My point is that null vs. empty content is very unintuitive, and I am not even 100% sure it is correct, although it has worked so far. Either always having the first tool call in the first event data, or having another field indicating that the stream is going to be a tool call stream, would be of extreme help.

Read about the events you can employ when using the chat completions streaming helper, such as ContentDeltaEvent for displaying content output as it arrives, vs. FunctionToolCallArgumentsDoneEvent for a collected response

Thanks for linking the documentation, as I wasn’t able to find proper documentation so far. Even though I don’t use a library for this, it is helpful.

One thing is that the only event object I am receiving is chat.completion.chunk; I have never received anything else so far. The other thing is that, even if I were to receive the listed events exactly, it is still not possible to tell from the beginning of the stream whether it is going to be a tool call stream or a content stream. Per your suggestion, FunctionToolCallArgumentsDoneEvent would be sent at the end of the stream (or at the end of each individual tool call’s stream, I am not sure), while I want to discriminate in the first event, as if it were header information.

Again, I was able to do this via the content-is-null-vs-empty-string check, but that just feels so unintuitive, and maybe I am completely missing something.

If the first event is like this, it is content streaming:

data: {"id":"chatcmpl-xxxx","object":"chat.completion.chunk","created":1747646516,"model":"gpt-4.1-2025-04-14","service_tier":"default","system_fingerprint":"fp_xxxx","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

If the first event is like this, it is tool call streaming with multiple tool calls:

data: {"id":"chatcmpl-xxxxx","object":"chat.completion.chunk","created":1747647015,"model":"gpt-4.1-2025-04-14","service_tier":"default","system_fingerprint":"fp_xxxxx","choices":[{"index":0,"delta":{"role":"assistant","content":null},"logprobs":null,"finish_reason":null}]}

If the first event is like this, it is again tool call streaming, but with exactly one tool call:

data: {"id":"chatcmpl-xxxx","object":"chat.completion.chunk","created":1747647091,"model":"gpt-4.1-2025-04-14","service_tier":"default","system_fingerprint":"fp_xxxxx","choices":[{"index":0,"delta":{"role":"assistant","content":null,"tool_calls":[{"index":0,"id":"call_jJWRMgq5wBeA10exj00NSbYR","type":"function","function":{"name":"get_weather","arguments":""}}],"refusal":null},"logprobs":null,"finish_reason":null}]}
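
To make that concrete, this is roughly the check I run on the first event (a simplified sketch of my own code: I parse the data: line as JSON and look only at the first delta):

import json

def classify_first_chunk(sse_line: str) -> str:
    # `sse_line` is the first "data: {...}" line of the stream (hypothetical helper).
    payload = json.loads(sse_line.removeprefix("data: "))
    delta = payload["choices"][0]["delta"]
    if delta.get("tool_calls"):        # single tool call: already embedded in the first chunk
        return "tool_calls"
    if delta.get("content") is None:   # null content: tool call chunks will follow
        return "tool_calls"
    return "content"                   # empty-string content: a text response stream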

That’s because the chat completions helper in the openai library is what collects the stream and then emits the event-like interface; it also has a gatherer to produce a final event.

It is a helper that follows the same pattern a programmer would use themselves if they want more versatility and prefer to interact with the stream logic directly.

For example, in your code, one might:

  • immediately relay output for the user to see
  • build up and store any tool call chunks received

reply = ""
tools = []
for chunk in c.parse():                                # `c` is your streaming chat completions response
    print(chunk.choices[0].delta)                      # debug: inspect each raw delta
    if chunk.choices[0].delta.content:
        reply += chunk.choices[0].delta.content        # gather for chat history
        print(chunk.choices[0].delta.content, end="")  # your output method
    if chunk.choices[0].delta.tool_calls:
        tools += chunk.choices[0].delta.tool_calls     # gather ChoiceDeltaToolCall list chunks

tools_obj = tool_list_to_tool_obj(tools)               # merge chunks into complete tool calls (below)
print(reply)
print(tools_obj)

And then, instead of asynchronously processing parallel calls as they are received, use the response finalization as a signal to proceed to the next step…

def tool_list_to_tool_obj(tools):
    # Initialize a dictionary with default values
    tool_calls_dict = defaultdict(lambda: {"id": None, "function": {"arguments": "", "name": None}, "type": None})

    # Iterate over the tool calls
    for tool_call in tools:
    ...
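
The rest of that gatherer might look something like this (a sketch, accumulating ChoiceDeltaToolCall chunks by index; adapt the return shape to whatever your chat history expects):

from collections import defaultdict

def tool_list_to_tool_obj(tools):
    # Merge streamed ChoiceDeltaToolCall chunks into complete tool call dicts.
    tool_calls_dict = defaultdict(lambda: {"id": None, "function": {"arguments": "", "name": None}, "type": None})

    for tool_call in tools:
        entry = tool_calls_dict[tool_call.index]
        if tool_call.id is not None:
            entry["id"] = tool_call.id
        if tool_call.type is not None:
            entry["type"] = tool_call.type
        if tool_call.function is not None:
            if tool_call.function.name is not None:
                entry["function"]["name"] = tool_call.function.name
            if tool_call.function.arguments:
                entry["function"]["arguments"] += tool_call.function.arguments

    # Ordered by stream index, in the shape of an assistant message's tool_calls field.
    return [tool_calls_dict[i] for i in sorted(tool_calls_dict)]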

If you wanted the stream itself to be a series of events, drawn from over a dozen event types, you’d use the newer Responses endpoint.

Then you’d get an SSE stream that looks like this for a simple response:

event: response.created
data: {"type":"response.created","response":{"id":"resp_234567","object":"response","created_at":1747655619,"status":"in_progress","error":null,"incomplete_details":null,"instructions":"You are an AI with computer vision","max_output_tokens":1000,"model":"gpt-4.1-mini-2025-04-14","output":[],"parallel_tool_calls":true,"previous_response_id":null,"reasoning":{"effort":null,"summary":null},"service_tier":"auto","store":false,"temperature":0.001,"text":{"format":{"type":"text"}},"tool_choice":"auto","tools":[],"top_p":0.001,"truncation":"auto","usage":null,"user":null,"metadata":{}}}

event: response.in_progress
data: {"type":"response.in_progress","response":{"id":"resp_234567","object":"response","created_at":1747655619,"status":"in_progress","error":null,"incomplete_details":null,"instructions":"You are an AI with computer vision","max_output_tokens":1000,"model":"gpt-4.1-mini-2025-04-14","output":[],"parallel_tool_calls":true,"previous_response_id":null,"reasoning":{"effort":null,"summary":null},"service_tier":"auto","store":false,"temperature":0.001,"text":{"format":{"type":"text"}},"tool_choice":"auto","tools":[],"top_p":0.001,"truncation":"auto","usage":null,"user":null,"metadata":{}}}

event: response.output_item.added
data: {"type":"response.output_item.added","output_index":0,"item":{"id":"msg-123456","type":"message","status":"in_progress","content":[],"role":"assistant"}}

event: response.content_part.added
data: {"type":"response.content_part.added","item_id":"msg-123456","output_index":0,"content_index":0,"part":{"type":"output_text","annotations":[],"text":""}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":"Under"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":"stood"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":"."}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" I"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" will"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" write"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" clearly"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" and"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" briefly"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":"."}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" How"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" can"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" I"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" assist"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":" you"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg-123456","output_index":0,"content_index":0,"delta":"?"}

event: response.output_text.done
data: {"type":"response.output_text.done","item_id":"msg-123456","output_index":0,"content_index":0,"text":"Understood. I will write clearly and briefly. How can I assist you?"}

event: response.content_part.done
data: {"type":"response.content_part.done","item_id":"msg-123456","output_index":0,"content_index":0,"part":{"type":"output_text","annotations":[],"text":"Understood. I will write clearly and briefly. How can I assist you?"}}

event: response.output_item.done
data: {"type":"response.output_item.done","output_index":0,"item":{"id":"msg-123456","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"text":"Understood. I will write clearly and briefly. How can I assist you?"}],"role":"assistant"}}

event: response.completed
data: {"type":"response.completed","response":{"id":"resp_234567","object":"response","created_at":1747655619,"status":"completed","error":null,"incomplete_details":null,"instructions":"You are an AI with computer vision","max_output_tokens":1000,"model":"gpt-4.1-mini-2025-04-14","output":[{"id":"msg-123456","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"text":"Understood. I will write clearly and briefly. How can I assist you?"}],"role":"assistant"}],"parallel_tool_calls":true,"previous_response_id":null,"reasoning":{"effort":null,"summary":null},"service_tier":"default","store":false,"temperature":0.001,"text":{"format":{"type":"text"}},"tool_choice":"auto","tools":[],"top_p":0.001,"truncation":"auto","usage":{"input_tokens":30,"input_tokens_details":{"cached_tokens":0},"output_tokens":17,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":47},"user":null,"metadata":{}}}
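
On the client side, the openai Python library surfaces these same events as typed objects, so reading that stream might look roughly like this (a sketch; I’m assuming the event type strings and field names mirror the raw SSE shown above, and the prompt is just for illustration):

from openai import OpenAI

client = OpenAI()

stream = client.responses.create(
    model="gpt-4.1-mini",
    input="Write clearly and briefly.",   # hypothetical prompt matching the example above
    stream=True,
)

text = ""
for event in stream:
    if event.type == "response.output_text.delta":
        text += event.delta            # relay to the user as it arrives
    elif event.type == "response.completed":
        final = event.response         # the full Response object, as in a non-streamed call

print(text)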

Or, for a parallel tool call along with a message for the user: multiple output indexes and deltas, multiple response.output_item.done events, and then a final response.completed event containing a more familiar object, which I will leave here in “super-long-line” form.

event: response.created
data: {"type":"response.created","response":{"id":"resp_1234567","object":"response","created_at":1747656088,"status":"in_progress","error":null,"incomplete_details":null,"instructions":"You are an AI with computer vision","max_output_tokens":1000,"model":"gpt-4.1-mini-2025-04-14","output":[],"parallel_tool_calls":true,"previous_response_id":null,"reasoning":{"effort":null,"summary":null},"service_tier":"auto","store":false,"temperature":0.001,"text":{"format":{"type":"text"}},"tool_choice":"auto","tools":[{"type":"function","description":"Get current weather information for a location.\n requires informing user of intention, then you automatically call after informing.","name":"get_weather","parameters":{"type":"object","properties":{"location":{"type":"string","description":"City name or geographic coordinates"}},"required":["location"]},"strict":true}],"top_p":0.001,"truncation":"auto","usage":null,"user":null,"metadata":{}}}

event: response.in_progress
data: {"type":"response.in_progress","response":{"id":"resp_1234567","object":"response","created_at":1747656088,"status":"in_progress","error":null,"incomplete_details":null,"instructions":"You are an AI with computer vision","max_output_tokens":1000,"model":"gpt-4.1-mini-2025-04-14","output":[],"parallel_tool_calls":true,"previous_response_id":null,"reasoning":{"effort":null,"summary":null},"service_tier":"auto","store":false,"temperature":0.001,"text":{"format":{"type":"text"}},"tool_choice":"auto","tools":[{"type":"function","description":"Get current weather information for a location.\n requires informing user of intention, then you automatically call after informing.","name":"get_weather","parameters":{"type":"object","properties":{"location":{"type":"string","description":"City name or geographic coordinates"}},"required":["location"]},"strict":true}],"top_p":0.001,"truncation":"auto","usage":null,"user":null,"metadata":{}}}

event: response.output_item.added
data: {"type":"response.output_item.added","output_index":0,"item":{"id":"msg_345678","type":"message","status":"in_progress","content":[],"role":"assistant"}}

event: response.content_part.added
data: {"type":"response.content_part.added","item_id":"msg_345678","output_index":0,"content_index":0,"part":{"type":"output_text","annotations":[],"text":""}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":"I"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" will"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" check"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" the"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" current"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" weather"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" in"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" both"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" Los"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" Angeles"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" and"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" San"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" Francisco"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" for"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":" you"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_345678","output_index":0,"content_index":0,"delta":"."}

event: response.output_text.done
data: {"type":"response.output_text.done","item_id":"msg_345678","output_index":0,"content_index":0,"text":"I will check the current weather in both Los Angeles and San Francisco for you."}

event: response.content_part.done
data: {"type":"response.content_part.done","item_id":"msg_345678","output_index":0,"content_index":0,"part":{"type":"output_text","annotations":[],"text":"I will check the current weather in both Los Angeles and San Francisco for you."}}

event: response.output_item.done
data: {"type":"response.output_item.done","output_index":0,"item":{"id":"msg_345678","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"text":"I will check the current weather in both Los Angeles and San Francisco for you."}],"role":"assistant"}}

event: response.output_item.added
data: {"type":"response.output_item.added","output_index":1,"item":{"id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","type":"function_call","status":"in_progress","arguments":"","call_id":"call_X8Xf9LPRRccFURA0Tw4bSlfK","name":"get_weather"}}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","output_index":1,"delta":"{"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","output_index":1,"delta":"\"location"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","output_index":1,"delta":"\":"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","output_index":1,"delta":"\"Los"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","output_index":1,"delta":" Angeles"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","output_index":1,"delta":"\"}"}

event: response.function_call_arguments.done
data: {"type":"response.function_call_arguments.done","item_id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","output_index":1,"arguments":"{\"location\":\"Los Angeles\"}"}

event: response.output_item.done
data: {"type":"response.output_item.done","output_index":1,"item":{"id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","type":"function_call","status":"completed","arguments":"{\"location\":\"Los Angeles\"}","call_id":"call_X8Xf9LPRRccFURA0Tw4bSlfK","name":"get_weather"}}

event: response.output_item.added
data: {"type":"response.output_item.added","output_index":2,"item":{"id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","type":"function_call","status":"in_progress","arguments":"","call_id":"call_JEjHOGbZbo3FjYUBSQkjdfjF","name":"get_weather"}}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","output_index":2,"delta":"{"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","output_index":2,"delta":"\"location"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","output_index":2,"delta":"\":"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","output_index":2,"delta":"\"San"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","output_index":2,"delta":" Francisco"}

event: response.function_call_arguments.delta
data: {"type":"response.function_call_arguments.delta","item_id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","output_index":2,"delta":"\"}"}

event: response.function_call_arguments.done
data: {"type":"response.function_call_arguments.done","item_id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","output_index":2,"arguments":"{\"location\":\"San Francisco\"}"}

event: response.output_item.done
data: {"type":"response.output_item.done","output_index":2,"item":{"id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","type":"function_call","status":"completed","arguments":"{\"location\":\"San Francisco\"}","call_id":"call_JEjHOGbZbo3FjYUBSQkjdfjF","name":"get_weather"}}

event: response.completed
data: {"type":"response.completed","response":{"id":"resp_1234567","object":"response","created_at":1747656088,"status":"completed","error":null,"incomplete_details":null,"instructions":"You are an AI with computer vision","max_output_tokens":1000,"model":"gpt-4.1-mini-2025-04-14","output":[{"id":"msg_345678","type":"message","status":"completed","content":[{"type":"output_text","annotations":[],"text":"I will check the current weather in both Los Angeles and San Francisco for you."}],"role":"assistant"},{"id":"fc_682b1d9af724819883a2aa4ff79d53a10f8c2a8a5d902139","type":"function_call","status":"completed","arguments":"{\"location\":\"Los Angeles\"}","call_id":"call_X8Xf9LPRRccFURA0Tw4bSlfK","name":"get_weather"},{"id":"fc_682b1d9b5484819881ff3393a62c4db40f8c2a8a5d902139","type":"function_call","status":"completed","arguments":"{\"location\":\"San Francisco\"}","call_id":"call_JEjHOGbZbo3FjYUBSQkjdfjF","name":"get_weather"}],"parallel_tool_calls":true,"previous_response_id":null,"reasoning":{"effort":null,"summary":null},"service_tier":"default","store":false,"temperature":0.001,"text":{"format":{"type":"text"}},"tool_choice":"auto","tools":[{"type":"function","description":"Get current weather information for a location.\n requires informing user of intention, then you automatically call after informing.","name":"get_weather","parameters":{"type":"object","properties":{"location":{"type":"string","description":"City name or geographic coordinates"}},"required":["location"]},"strict":true}],"top_p":0.001,"truncation":"auto","usage":{"input_tokens":84,"input_tokens_details":{"cached_tokens":0},"output_tokens":18,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":102},"user":null,"metadata":{}}}

The Responses endpoint is also documented, and there is less “helper” wrapping done for you in the provided libraries; the stream itself already arrives as a series of typed events.
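
And the part that matters for your original question: with the Responses stream, response.output_item.added tells you the type of each output item (message vs. function_call) the moment it starts, so you can branch before any deltas arrive. Extending the earlier sketch (same assumptions about field names, with your function tools passed to create()):

text = ""
calls = {}    # accumulated tool calls, keyed by output item id

for event in stream:
    if event.type == "response.output_item.added" and event.item.type == "function_call":
        # Known to be a tool call before any arguments have streamed in.
        calls[event.item.id] = {"call_id": event.item.call_id, "name": event.item.name, "arguments": ""}
    elif event.type == "response.function_call_arguments.delta":
        calls[event.item_id]["arguments"] += event.delta
    elif event.type == "response.output_text.delta":
        text += event.delta            # user-visible message text, stream it straight out
    elif event.type == "response.completed":
        final = event.response         # contains both the message and the function_call items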