Tool call streaming doesn't stream

When making a streaming call with tools, the entire message queues up and is then returned all at once.

For example, given this code:

import datetime
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI()
print(datetime.datetime.now(), "start")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "system", "content": "call the fake_tool five times with different parameters"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "fake_tool",
                "description": "A fake tool for testing purposes.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "param1": {
                            "type": "string",
                            "description": "A fake parameter."
                        }
                    },
                    "required": ["param1"],
                    "additionalProperties": False
                },
                "strict": True
            }
        }
    ],
    tool_choice="required",
    stream=True,
    temperature=0.0
)
for chunk in response:
    print(datetime.datetime.now(), chunk)
print(datetime.datetime.now(), "end")

The expectation is that chunks will start showing up after ~500ms and arrive every ~25ms. Instead, they all show up at once after ~3s. In other words, it’s not actually streaming.

2024-10-31 22:58:05.848397 start
2024-10-31 22:58:08.012473 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.015622 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id='call_Gejo56IH11JWe2gxEbEEaS4i', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.015965 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.016268 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.016565 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.016872 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='t1"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.017161 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id='call_TgEeqWKMZJDBfpgcNWcmSGdL', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.017452 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.017752 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.018032 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.018311 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='t2"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.018825 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id='call_GK7d6ILGxfUFkJam3Z5gXFCS', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.019126 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.019459 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.019802 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.020091 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='t3"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.020378 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id='call_NVIElny7etmsZn0iLdZMoDwF', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.020681 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.020968 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.021252 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.021535 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='t4"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.021835 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id='call_S0GTFCNNcardvxWgJg5XFbh0', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.022117 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.022394 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.022692 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.022975 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='t5"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.023645 ChatCompletionChunk(id='chatcmpl-AOYP84xNJx6UN7Zb3RuXY3kqtV27g', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='tool_calls', index=0, logprobs=None)], created=1730415486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 22:58:08.023963 end
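For reference, the fragments in the log above are meant to be stitched together by `index`: the first delta for each call carries `id` and `name`, and later deltas append to `arguments`. Below is a minimal sketch of that reassembly. It uses `SimpleNamespace` stand-ins that mimic only the fields visible in the log, and the helper names `collect_tool_calls` and `fake_delta` are made up for illustration; they are not part of the SDK.

```python
from types import SimpleNamespace


def collect_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls = {}
    for d in deltas:
        call = calls.setdefault(d.index, {"id": None, "name": None, "arguments": ""})
        if d.id is not None:
            call["id"] = d.id
        if d.function.name is not None:
            call["name"] = d.function.name
        # Argument text arrives in pieces; concatenate in arrival order.
        call["arguments"] += d.function.arguments or ""
    return [calls[i] for i in sorted(calls)]


def fake_delta(index, arguments, id=None, name=None):
    # Hypothetical stand-in for a tool-call delta, with only the fields shown in the log.
    return SimpleNamespace(
        index=index,
        id=id,
        function=SimpleNamespace(name=name, arguments=arguments),
    )


# Fragments copied from the first tool call in the log above.
deltas = [
    fake_delta(0, "", id="call_Gejo56IH11JWe2gxEbEEaS4i", name="fake_tool"),
    fake_delta(0, '{"pa'),
    fake_delta(0, 'ram1"'),
    fake_delta(0, ': "tes'),
    fake_delta(0, 't1"}'),
]
print(collect_tool_calls(deltas))
```

With a real stream, the deltas would presumably come from `chunk.choices[0].delta.tool_calls` on each chunk where that field is not None, as the log suggests.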

If I remove the tool call, everything works as expected.

Example code

import datetime
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI()
print(datetime.datetime.now(), "start")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "system", "content": "write a short story 20 words"}],
    stream=True,
    temperature=0.0
)
for chunk in response:
    print(datetime.datetime.now(), chunk)
print(datetime.datetime.now(), "end")

And the streaming output

2024-10-31 23:09:15.337921 start
2024-10-31 23:09:15.761900 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content='', function_call=None, role='assistant', tool_calls=None, refusal=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.781246 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content='In', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.781533 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' the', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.799272 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' quiet', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.799556 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' forest', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.834239 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.834515 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' a', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.836006 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' lost', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.836290 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' kitten', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.842158 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' found', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.842424 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' a', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.886468 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' glowing', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.886783 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' mushroom', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.887965 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content='.', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.888234 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' It', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.895099 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' touched', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.895366 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' it', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.903020 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.903289 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' transforming', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.920959 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' into', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.921228 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' a', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.935233 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' majestic', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.935529 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=',', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.955541 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' wise', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.955835 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=' cat', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.956084 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content='.', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.956861 ChatCompletionChunk(id='chatcmpl-AOYZvcVM11oiYfrCW1L7nIlcQR5FF', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1730416155, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_45cf54deae', usage=None)
2024-10-31 23:09:15.960319 end
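To put numbers on the difference, here is a small sketch that computes the time to first chunk and the spread between first and last chunk from the timestamps in the two logs above. The function name `stream_timing` is just an illustrative choice; the timestamps are copied from the logs.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"


def stream_timing(start, first_chunk, last_chunk):
    """Return (seconds until first chunk, seconds between first and last chunk)."""
    t0 = datetime.strptime(start, FMT)
    t1 = datetime.strptime(first_chunk, FMT)
    t2 = datetime.strptime(last_chunk, FMT)
    return (t1 - t0).total_seconds(), (t2 - t1).total_seconds()


# Timestamps copied from the tool-call log and the plain-text log above.
with_tools = stream_timing(
    "2024-10-31 22:58:05.848397",
    "2024-10-31 22:58:08.012473",
    "2024-10-31 22:58:08.023645",
)
without_tools = stream_timing(
    "2024-10-31 23:09:15.337921",
    "2024-10-31 23:09:15.761900",
    "2024-10-31 23:09:15.956861",
)
print("with tools:    first chunk %.3fs, spread %.3fs" % with_tools)
print("without tools: first chunk %.3fs, spread %.3fs" % without_tools)
```

For the tool-call run this gives roughly 2.16s to the first chunk with all chunks landing within about 11ms, versus roughly 0.42s to the first chunk and about 195ms of spread for the plain-text run.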

How about if you remove the “proxy” URL and interact directly with OpenAI instead of some packet repackager…


Same issues. Good thinking though.


I ran it locally, and it seems to perform as expected.

Remember: setting strict: true and using gpt-4o requests a structured output for the function call.

That first requires creating a grammar artifact on the server, which can then be cached and used to enforce output tokens.

Setting strict: False, temperature:0.1, and using gpt-4o-2024-05-13 gives this latency:

2024-10-31 16:21:27.649820 start
2024-10-31 16:21:28.996714 ChatCompletionChunk(id='chatcmpl


No big difference with AsyncIO

import datetime
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def main():

    print(datetime.datetime.now(), "start")
    response = await client.chat.completions.create(
        model="gpt-4o-2024-05-13",
        messages=[{"role": "system", "content": "call the fake_tool five times with different parameters"}],
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "fake_tool",
                    "description": "A fake tool for testing purposes.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "param1": {
                                "type": "string",
                                "description": "A fake parameter."
                            }
                        },
                        "required": ["param1"],
                        "additionalProperties": False
                    },
                    "strict": False
                }
            }
        ],
        tool_choice="required",
        stream=True,
        temperature=0.1
    )
    async for chunk in response:
        print(datetime.datetime.now(), chunk)
    print(datetime.datetime.now(), "end")


asyncio.run(main())

Thank you for your help, but I ran your code and I’m getting the same issue. Note that the issue isn’t the total time; what matters is the time until the first chunk and the time until the last chunk. The only change I made was to ask the LLM to call the tool 15 times so the effect would be more pronounced. Time to first chunk was ~4s and time to last chunk was also ~4s, meaning all the chunks showed up at the same time.

2024-10-31 23:41:41.782687 start
2024-10-31 23:41:45.666760 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.672381 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id='call_C8lYuCHYkyml4KAnarxGe5Zg', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.673171 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.673900 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.674840 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.675781 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='t1"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.676443 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id='call_o1IJ9BPEEbeXklrwn2Smz2mA', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.677104 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.677826 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.678478 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.679271 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='t2"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.679972 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id='call_cnEq4ExBs4gYosaCsmQCa7qO', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.680961 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.681633 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.682279 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.682944 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='t3"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.683631 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id='call_7y6ycvktSsXeFC5jTnP9uaPT', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.684456 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.685448 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.686120 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.686669 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='t4"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.687296 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id='call_HFfLJXZ5ufYfoZt4wlDfakUB', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.687966 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.711620 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.712127 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.712573 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='t5"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.713077 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=5, id='call_rLCzt1CoN3K7FOfYPrFJuKfb', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.713517 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=5, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.714015 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=5, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.714497 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=5, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.719154 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=5, id=None, function=ChoiceDeltaToolCallFunction(arguments='t6"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.719711 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=6, id='call_CEhwIJT5m67ud5rKWA42R8jJ', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.720228 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=6, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.721912 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=6, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.725038 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=6, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.725573 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=6, id=None, function=ChoiceDeltaToolCallFunction(arguments='t7"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.726101 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=7, id='call_bDzb67cZSa3LkKoTiksRSytG', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.726635 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=7, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.727139 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=7, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.728491 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=7, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.729051 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=7, id=None, function=ChoiceDeltaToolCallFunction(arguments='t8"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.729561 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=8, id='call_wUWmdMCS3fITlUabSombEwCI', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.730094 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=8, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.730612 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=8, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.731121 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=8, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.731650 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=8, id=None, function=ChoiceDeltaToolCallFunction(arguments='t9"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.732152 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=9, id='call_pbmReqqw988VcUTD5dSm2f7W', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.732674 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=9, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.733186 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=9, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.733715 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=9, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.734245 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=9, id=None, function=ChoiceDeltaToolCallFunction(arguments='t10"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.738752 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=9, id=None, function=ChoiceDeltaToolCallFunction(arguments='}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.740052 ChatCompletionChunk(id='chatcmpl-AOZ5K2IpuZLOsSYDjPTDiEEl5S9Jg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='tool_calls', index=0, logprobs=None)], created=1730418102, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 23:41:45.741480 end

I believe there’s an issue with the OpenAI service. If someone from OpenAI can take a look, we at Instacart would appreciate it.

Performance on a much larger input, with the later trials also hitting the cache:

For 3 trials of gpt-4o @ 2024-10-31 04:50PM:

| Stat | Average | Cold | Minimum | Maximum |
|---|---|---|---|---|
| stream rate | 65.633 | 68.6 | 60.3 | 68.6 |
| latency (s) | 0.486 | 0.8023 | 0.3009 | 0.8023 |
| total response (s) | 2.428 | 2.6549 | 2.2225 | 2.6549 |
| total rate | 52.991 | 48.213 | 48.213 | 57.593 |
| response tokens | 128.000 | 128 | 128 | 128 |
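For anyone reading these numbers, here is a small sketch of how the two rate rows appear to relate to the other rows. The formulas are inferred from the figures, not stated anywhere in this thread: "total rate" looks like tokens divided by total response time, and "stream rate" looks like the tokens after the first one divided by the time after the first chunk. The function names and the tokens-per-second unit are my own assumptions.

```python
def total_rate(tokens: int, total_s: float) -> float:
    """Assumed definition: all response tokens over the whole response time."""
    return tokens / total_s


def stream_rate(tokens: int, latency_s: float, total_s: float) -> float:
    """Assumed definition: tokens after the first, over the time after the first chunk."""
    return (tokens - 1) / (total_s - latency_s)


# Cold-run values from the 3-trial table: 128 tokens, 0.8023 s latency, 2.6549 s total.
cold_total = total_rate(128, 2.6549)
cold_stream = stream_rate(128, 0.8023, 2.6549)

# These land close to the table's cold figures of 48.213 and 68.6.
print(round(cold_total, 3), round(cold_stream, 1))
```

If these definitions are right, a run where every chunk arrives in one burst would show a latency almost equal to the total response time, which is the pattern in the tool-call measurements.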

You already identified the inclusion of the tool call as the cause of the slowness, which I explained. You also pay the retrieval time from the cache lookup of the function after hashing.

Running your input with no cache and a strict function, with the instruction duplicated in both the system and user input:

For 5 trials of gpt-4o @ 2024-10-31 05:09PM:

| Stat | Average | Cold | Minimum | Maximum |
|---|---|---|---|---|
| latency (s) | 1.961 | 2.1195 | 1.7586 | 2.138 |
| total response (s) | 1.976 | 2.1351 | 1.7742 | 2.1601 |

Indeed, all you can do is hope that the OpenAI lurkers, who rarely respond, have a look, or send your latency report to “help”. Or pay your way out of the bottom half of the tiers, if that is the case.

Hi,
I don’t think you understand: the issue is not latency. It’s that adding a tool call disables streaming, so all the chunks show up at once. Compare my first post. When calling without tools, chunks start coming after ~250ms and then arrive every ~25ms. When calling with tools, nothing shows up for ~3s, then everything shows up all at once.

I’m pretty sure this has nothing to do with us and is a bug in OpenAI’s API.
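To make the distinction concrete: the number to look at is the gap between consecutive chunks, not the time to the first chunk. Below is a minimal sketch that computes the median inter-chunk gap from printed timestamps. The timestamps are copied from two logs in this thread (the 23:41:45 run and the 16:14:58 run). The helper names and the 5 ms cutoff are arbitrary choices for illustration, not anything defined by the API.

```python
from datetime import datetime
from statistics import median

FMT = "%Y-%m-%d %H:%M:%S.%f"


def median_gap_ms(stamps):
    """Median time between consecutive chunk arrivals, in milliseconds."""
    times = [datetime.strptime(s, FMT) for s in stamps]
    gaps = [(b - a).total_seconds() * 1000 for a, b in zip(times, times[1:])]
    return median(gaps)


def looks_buffered(stamps, threshold_ms=5.0):
    """Hypothetical heuristic: sub-threshold gaps suggest chunks arrived in one burst."""
    return median_gap_ms(stamps) < threshold_ms


# Six consecutive chunk timestamps from the 23:41:45 log in this thread.
burst = [
    "2024-10-31 23:41:45.674840",
    "2024-10-31 23:41:45.675781",
    "2024-10-31 23:41:45.676443",
    "2024-10-31 23:41:45.677104",
    "2024-10-31 23:41:45.677826",
    "2024-10-31 23:41:45.678478",
]

# First six chunk timestamps from the 16:14:58 log in this thread.
streamed = [
    "2024-10-31 16:14:58.277495",
    "2024-10-31 16:14:58.293140",
    "2024-10-31 16:14:58.315287",
    "2024-10-31 16:14:58.330954",
    "2024-10-31 16:14:58.346580",
    "2024-10-31 16:14:58.362204",
]

print(median_gap_ms(burst), looks_buffered(burst))
print(median_gap_ms(streamed), looks_buffered(streamed))
```

On these samples the first run has a median gap under 1 ms, while the second has a median gap of roughly 15 ms, even though both waited about the same time for the first chunk.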

Thanks for your help! Happy Halloween.

I understand, but I couldn’t replicate that.

2024-10-31 16:14:55.979856 start
2024-10-31 16:14:58.277495 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.293140 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id='call_lq1yuLiItpY7XBZTaLJv0me1', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.315287 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.330954 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.346580 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.362204 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=0, id=None, function=ChoiceDeltaToolCallFunction(arguments='t1"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.393452 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id='call_mHRyMCdOyogD1m5COZO5FLYA', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.415566 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.431235 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.446857 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.462445 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=1, id=None, function=ChoiceDeltaToolCallFunction(arguments='t2"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.478070 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id='call_w5HjitBkV7CIHCxdI8ZUkUNN', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.509356 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.515897 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.547189 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.562816 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=2, id=None, function=ChoiceDeltaToolCallFunction(arguments='t3"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.578424 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id='call_menoLRSRMevO9rfqzokNvzSi', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.594064 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.616176 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.631845 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.663094 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=3, id=None, function=ChoiceDeltaToolCallFunction(arguments='t4"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.678707 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id='call_otMWKISXZ1bS1K9nw5txyADE', function=ChoiceDeltaToolCallFunction(arguments='', name='fake_tool'), type='function')]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.694328 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='{"pa', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.716474 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='ram1"', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.732126 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments=': "tes', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.747765 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=[ChoiceDeltaToolCall(index=4, id=None, function=ChoiceDeltaToolCallFunction(arguments='t5"}', name=None), type=None)]), finish_reason=None, index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.763388 ChatCompletionChunk(id='chatcmpl-xyz', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='tool_calls', index=0, logprobs=None)], created=1730416496, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_159d8341cc', usage=None)
2024-10-31 16:14:58.779002 end

Solace: Half a second of not seeing a tool call isn’t really important when you have to receive it all to act anyway.

Python 3.11 and this post’s “pip” command will get you the latest release versions of the modules that openai specifies. After that, it is down to the platform.

Hi,
Look at your first two logs: you are replicating it. Nothing shows up for 3s, then everything gets dumped very quickly. Try changing the prompt to ask the LLM to call the tool 100 times and you’ll see that the first event takes forever.
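To put numbers on "nothing for 3s, then everything at once", a small timing helper is enough. This is only a sketch: `time_stream` and its `clock` parameter are names made up for this example, not part of the openai package, and it works on any iterable of chunks.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple


def time_stream(
    chunks: Iterable,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], List[float]]:
    """Return (seconds until the first chunk, gaps between later chunks).

    The first value is None if the stream yields nothing.
    """
    start = clock()
    first_delay: Optional[float] = None
    gaps: List[float] = []
    last = start
    for _ in chunks:
        now = clock()
        if first_delay is None:
            # Time spent waiting before anything arrived.
            first_delay = now - start
        else:
            # Time between this chunk and the previous one.
            gaps.append(now - last)
        last = now
    return first_delay, gaps
```

With the code from the first post, `first_delay, gaps = time_stream(response)` would replace the `for chunk in response` loop. A long `first_delay` followed by gaps of a few milliseconds is the pattern described here; a stream that is paced by generation should show gaps closer to the per-token rate.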

You don’t have to receive it all to act :slight_smile:. Once you have accumulated the first tool call, you can act on it while you wait for the next one. This is how my application works.
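A rough sketch of that accumulate-and-act pattern, not my actual application code. It assumes the fragments of each tool call arrive contiguously and in index order, which is what the logs above show. `completed_tool_calls` and `fake_chunk` are hypothetical names for this example; `fake_chunk` only imitates the few chunk attributes the loop reads, so the sketch runs without an API call.

```python
import json
from types import SimpleNamespace


def completed_tool_calls(chunks):
    """Yield (id, name, arguments) as soon as each tool call is complete."""
    current = None  # the tool call being accumulated
    for chunk in chunks:
        if not chunk.choices:
            continue
        choice = chunk.choices[0]
        for delta in choice.delta.tool_calls or []:
            # A new index means the previous call has all its arguments.
            if current is not None and delta.index != current["index"]:
                yield current["id"], current["name"], current["arguments"]
                current = None
            if current is None:
                current = {"index": delta.index, "id": None,
                           "name": None, "arguments": ""}
            if delta.id:
                current["id"] = delta.id
            if delta.function is not None:
                if delta.function.name:
                    current["name"] = delta.function.name
                current["arguments"] += delta.function.arguments or ""
        # finish_reason closes the last open call.
        if choice.finish_reason is not None and current is not None:
            yield current["id"], current["name"], current["arguments"]
            current = None
    if current is not None:  # stream ended without a finish_reason
        yield current["id"], current["name"], current["arguments"]


def fake_chunk(index=None, id=None, name=None, arguments=None,
               finish_reason=None):
    """Hypothetical stand-in for a streamed chunk (illustration only)."""
    tool_calls = None
    if index is not None:
        fn = SimpleNamespace(name=name, arguments=arguments)
        tool_calls = [SimpleNamespace(index=index, id=id, function=fn)]
    delta = SimpleNamespace(tool_calls=tool_calls)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])


stream = [
    fake_chunk(),  # role-only first chunk
    fake_chunk(0, "call_a", "fake_tool", ""),
    fake_chunk(0, arguments='{"pa'),
    fake_chunk(0, arguments='ram1": "test1"}'),
    fake_chunk(1, "call_b", "fake_tool", ""),
    fake_chunk(1, arguments='{"param1": "test2"}'),
    fake_chunk(finish_reason="tool_calls"),
]

for call_id, name, args in completed_tool_calls(stream):
    # Each call can be dispatched here while later ones are still streaming.
    print(call_id, name, json.loads(args))
```

Passing the real `response` iterator instead of `stream` should behave the same way, as long as the ordering assumption holds. The benefit only appears if the chunks actually arrive spread out over time, which is the point of this thread.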

Also, I did update my deps.

Thanks for all the help. I believe I’ve found a legitimate bug that needs to be fixed. I just want this to get escalated.

I expect that I am receiving output tokens at the rate the model actually produces them, at a rate similar to other language-model calls, after a longer initial delay: the “function” delay.

This delay is due to functions and the API parser having the new structured function schema available. It is a concern raised in several forum topics, as structured outputs underperform because of the additional computation.

From its announcement:

Note: the first request you make with any schema will have additional latency as our API processes the schema, but subsequent requests with the same schema will not have additional latency.

It would seem there is an additional precomputation burden on ANY call, one that is not explicitly mentioned but has persisted since the feature was introduced.

One can imagine that even for repetitions, the “hash input function object tokens” → “validate against model and schema to see if strict” → “search artifact database” → “return cache hit results” → “load tokenizer grammar” process has overhead that is more dramatic on smaller requests.
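To make that imagined chain concrete, here is a toy sketch of a schema-keyed artifact cache. It is pure speculation for illustration and says nothing about how OpenAI's service actually works; `SchemaArtifactCache` and `compile_fn` are invented names. The point is only that even a cache hit still pays for canonicalizing, hashing, and looking up the tools object on every request.

```python
import hashlib
import json


class SchemaArtifactCache:
    """Toy cache: tools schema -> compiled artifact (e.g. a grammar)."""

    def __init__(self, compile_fn):
        self._compile = compile_fn  # stands in for the expensive step
        self._artifacts = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(tools):
        # Canonical JSON so equal schemas hash equally regardless of key order.
        canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, tools):
        key = self.key_for(tools)  # paid on every request, hit or miss
        if key in self._artifacts:
            self.hits += 1
        else:
            self.misses += 1
            self._artifacts[key] = self._compile(tools)
        return self._artifacts[key]
```

In this toy model the first request with a schema pays for `compile_fn`, matching the announcement's note, while repeat requests skip it but still do the serialize, hash, and lookup work, which would weigh proportionally more on small requests.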

Hopefully there are big brains on the task of shaving off another second or two.
