How to use Streaming only on final step, no streaming when retrieving the function name

functions:
“functions”: [
{
“parameters”: {
“type”: “object”,
“properties”: {
“region”: {
“type”: “string”
}
},
“required”: [
“region”
]
},
“name”: “get_weather”,
“description”: “Get weather”
},
{
“parameters”: {
“type”: “object”,
“properties”: {
“region”: {
“type”: “string”
}
},
“required”: [
“region”
]
},
“name”: “get_coordinate”,
“description”: “Get coordinates”
},
{
“parameters”: {
“type”: “object”,
“properties”: {
“region”: {
“type”: “string”
}
},
“required”: [
“region”
]
},
“name”: “get_population”,
“description”: “Get population”
}
]

First messages input:
[{
“role”: “user”,
“content”: “Coordinates and weather in Beijing”
}]
First response:
{
“role”: “assistant”,
“content”: null,
“function_call”: {
“name”: “get_coordinate”,
“arguments”: “{\n"region”: “Beijing”\n}"
}
}

Second messages input:
[{
“role”: “user”,
“content”: “Coordinates and weather in Beijing”
},
{
“role”: “assistant”,
“content”: null,
“function_call”: {
“name”: “get_coordinate”,
“arguments”: “{\n"region”: “Beijing”\n}"
}
},
{
“role”: “function”,
“name”: “get_coordinate”,
“content”: “{“longitude”:“115°25’ to 117°30’”,“latitude”:“39°26’ to 41°03’”}”
}]

Second response:
{
“role”: “assistant”,
“content”: null,
“function_call”: {
“name”: “get_weather”,
“arguments”: “{\n"region”: “Beijing”\n}"
}
}

Third messages input:
[
{
“role”: “user”,
“content”: “Coordinates and weather in Beijing”
},
{
“role”: “assistant”,
“content”: null,
“function_call”: {
“name”: “get_coordinate”,
“arguments”: “{\n"region”: “Beijing”\n}"
}
},
{
“role”: “function”,
“name”: “get_coordinate”,
“content”: “{“longitude”:“115°25’ to 117°30’”,“latitude”:“39°26’ to 41°03’”}”
},
{
“role”: “assistant”,
“content”: null,
“function_call”: {
“name”: “get_weather”,
“arguments”: “{\n"region”: “Beijing”\n}"
}
},
{
“role”: “function”,
“name”: “get_weather”,
“content”: “{“weather”:“Heavy rain”}”
}
]

Third response:
{
“role”: “assistant”,
“content”: “The coordinates of Beijing are longitude 115°25’ to 117°30’, latitude 39°26’ to 41°03’. The current weather is heavy rain.”
}

During the interaction, to avoid making the user wait when answering their question, we pass the parameter “stream”: true when calling the GPT API. GPT then streams the response to us, and we relay this stream to the frontend. However, when retrieving the function name based on the user’s question, we do not need to use the streaming method. In this process, “stream”: false should be set.

We are currently unable to obtain a SIGNAL indicating that there is no matching function available. When there is no matching function, GPT provides an answer to the question. If we wait at this step and send the answer to the user, the response time is uncertain, which is not user-friendly. Therefore, we would like to use the streaming method to provide the final result to the user, but without using the streaming method when retrieving the function name.

If the calling of a function is up to the AI, then there is no way to predict whether it will call a function or whether it will provide a human-readable answer, and in fact, it can do both.

The streaming does not take any longer to complete, so that is not a reason for preferring it.

You will have handle both the case of content to user AND a function call, as both can be present. Examine the output below, where I wrote a prompt to have AI both explain a word problem, and then also solve it via function call:

"choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To measure the height of a tree when given the distance to the tree and the viewpoint angle, you can use trigonometry. The following steps outline the process:\n\n1. Stand at a known distance from the tree. In this case, the distance is given as 151.33 meters.\n2. Measure the angle between your line of sight and the ground (viewpoint angle). In this case, the viewpoint angle is given as 22.35 degrees.\n3. Identify the right triangle formed by the tree, your position, and the ground.\n4. The height of the tree can be calculated using the tangent function:\n\n   height = distance * tan(viewpoint angle)\n\nNow, let's calculate the height of the tree using the given values:",
        "function_call": {
          "name": "python",
          "arguments": "import math\n\ndistance = 151.33  # meters\nviewpoint_angle = math.radians(22.35)  # convert to radians\n\nheight = distance * math.tan(viewpoint_angle)\nheight"
        }
      },
      "finish_reason": "function_call"
    }

It is easy to select streaming when you are also returning an API’s function role message, but not streaming when a function call is only a possibility will result in that AI response to the user being delayed.