Help for function calls with streaming

What I posted is full code for an API request using the openai Python library to get an AI response from a model. The params accepted by the chat.completions function are written in Python dictionary format (which looks like JSON key/value pairs).

Here is a linear example, in another topic, of tool-enabled code which builds the params input to send to the AI. That input is like the chat history a chatbot loop would be building, to give the AI a memory of past user input plus tool calls and tool returns:


Here’s an explanation of the code I gave (where the function definition must come first in the .py file).


The code does the following:

c = client.chat.completions.with_raw_response.create(**params)

This line makes a request to the OpenAI API. **params is used to unpack the dictionary of parameters into the API request. They are the normal parameters, like model="gpt-3.5-turbo", but because the input is a dictionary, the params dictionary looks more like a raw JSON API request, with "model": "gpt-3.5-turbo" (note the colon).
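
For illustration, a params dictionary for a streaming, tool-enabled request might look like this (the message text and the tool_list variable are hypothetical placeholders):

params = {
    "model": "gpt-3.5-turbo",
    "max_tokens": 500,
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tool_list,  # your function specifications (hypothetical name)
    "stream": True,  # required for the chunked iteration shown below
}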

The request is made to OpenAI’s chat completions API endpoint at client.chat.completions. The with_raw_response.create method indicates that the response comes from the httpx library within and includes additional information like headers; it is not parsed automatically and is left in its raw, JSON-like format, so that you can use certain httpx methods on the returned object, c.

reply = ""
tools = []
for chunk in c.parse():

This initiates a loop over the response from the API call, parsing the raw response into a more usable Python object using the parse() method, which returns an iterable (a generator that emits the network chunks as they are received). The response from the API comes in “chunks” to allow processing the data in a streaming manner.

print(chunk.choices[0].delta)

Just for diagnosis, so you can see more of what is being received over the network, this line prints out the first choice in each chunk that the streaming API sends. (“Choices” exist because you can ask the AI to answer the same input multiple times, for a choice of responses, using n: 2 or more; this is rarely used.) Each choice has an associated delta object, which contains what was added between the previous chunk and the current chunk in the stream.
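
For a plain text reply, those printed deltas look something like this (field names as in recent versions of the openai library; the values are illustrative):

ChoiceDelta(content='', function_call=None, role='assistant', tool_calls=None)
ChoiceDelta(content='Hello', function_call=None, role=None, tool_calls=None)
ChoiceDelta(content=' there!', function_call=None, role=None, tool_calls=None)
ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None)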

if chunk.choices[0].delta.content:
    reply += chunk.choices[0].delta.content      
    print(chunk.choices[0].delta.content, end="")  

If the delta has a content field (which carries the assistant’s reply, almost token-by-token), it is appended to the reply string and is also printed out.

if chunk.choices[0].delta.tool_calls:
    tools += chunk.choices[0].delta.tool_calls    

If there are any tool_calls in the delta (such as calls to the functions you offered, or to OpenAI’s system-level tools), they’re added to the tools list for later processing. Each chunk holds a complex collection of the parts of a function call: only the first chunk of a given function call has its ID, while continuation chunks still arrive as a full object, not just the additional argument text of the tool call.
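
To make the reassembly problem concrete, streamed tool-call fragments arrive roughly like this (illustrative values; note that only the first fragment carries the id and the function name):

ChoiceDeltaToolCall(index=0, id='call_abc123', type='function',
    function=ChoiceDeltaToolCallFunction(name='get_weather', arguments=''))
ChoiceDeltaToolCall(index=0, id=None, type=None,
    function=ChoiceDeltaToolCallFunction(name=None, arguments='{"city": '))
ChoiceDeltaToolCall(index=0, id=None, type=None,
    function=ChoiceDeltaToolCallFunction(name=None, arguments='"Paris"}'))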

from collections import defaultdict

def tool_list_to_tool_obj(tools):
    ...

This function, whose def would appear earlier in the code, converts the list of stream delta objects extracted from the chunks’ tool calls into a single object representation. If a tool call sends part of its arguments in one chunk and then sends more in a subsequent chunk, the pieces are all gathered and associated with the same tool call using a defaultdict. Once all chunks have been processed, it produces a dict of tool details (not unlike what would be returned by the non-streaming OpenAI API).
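
Since the definition is elided above, here is one possible sketch of such a reassembly function, assuming the streamed delta objects carry index, id, type, function.name, and function.arguments fields (as in recent openai library versions), and grouping by index because continuation fragments have no id:

from collections import defaultdict

def tool_list_to_tool_obj(tools):
    # one accumulator per tool-call index; fragments of the same call share an index
    calls = defaultdict(lambda: {"id": None, "type": None,
                                 "function": {"name": None, "arguments": ""}})
    for part in tools:
        entry = calls[part.index]
        if part.id is not None:             # only the first fragment carries the id
            entry["id"] = part.id
        if part.type is not None:
            entry["type"] = part.type
        if part.function.name is not None:  # only the first fragment carries the name
            entry["function"]["name"] = part.function.name
        if part.function.arguments:         # argument text accumulates across fragments
            entry["function"]["arguments"] += part.function.arguments
    return {"tool_calls": list(calls.values())}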

Finally,

tools_obj = tool_list_to_tool_obj(tools)
print(reply)
print(tools_obj)

We use the function to make the non-streaming version of the tool call object.

The reply string and the objectified tool calls dict are both printed as a demonstration of what information has been gathered. Remember: the AI’s “content” was already printed token-by-token to the user during the loop (and that is where, inside the loop, you would substitute your own “printing” method for receiving chunks).

The variables that were set, reply and tools_obj, are now available for use in your code as before: you apply your existing parsing, now executing the functions (of which there can be multiple, run in parallel) and sending each result back to the AI (see the earlier linked topic for doing this too).
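
As a hedged sketch of that step (execute_tool is a hypothetical stand-in for your own dispatch code, and tools_obj is assumed to have the {"tool_calls": [...]} shape built above):

# append the assistant turn that requested the tools to the chat history
params["messages"].append({"role": "assistant", "content": reply or None,
                           "tool_calls": tools_obj["tool_calls"]})
# then one "tool" role message per call, carrying your function's return value
for call in tools_obj["tool_calls"]:
    result = execute_tool(call["function"]["name"], call["function"]["arguments"])
    params["messages"].append({"role": "tool", "tool_call_id": call["id"],
                               "content": result})
# a new create(**params) request then lets the AI answer using the results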

In summary, the script I provided is designed to communicate with the OpenAI API, receive responses in a streaming manner, and handle chunks of data that are parts of either dialogue (in the content) or system-level tool invocations (tool_calls). The chunks are pieced together appropriately to form complete dialogue or tool invocations.

Writing a chatbot (link with even more I wrote) can be a simple loop: take input, send it with history, parse the response, and send back the fulfilled tool return instead of asking the user a new question.
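
Put together, such a loop might be sketched like this (get_response is a hypothetical wrapper around the streaming request shown earlier, returning reply and tools_obj; execute_tool is as before):

chat = []
while True:
    chat.append({"role": "user", "content": input("You: ")})
    while True:
        reply, tools_obj = get_response(chat)  # the streaming request shown earlier
        if not tools_obj["tool_calls"]:
            chat.append({"role": "assistant", "content": reply})
            break  # no tools requested: the answer is done, ask the user again
        # otherwise, record the tool calls and their returns, and re-send
        chat.append({"role": "assistant", "content": reply or None,
                     "tool_calls": tools_obj["tool_calls"]})
        for call in tools_obj["tool_calls"]:
            chat.append({"role": "tool", "tool_call_id": call["id"],
                         "content": execute_tool(call["function"]["name"],
                                                 call["function"]["arguments"])})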

If you are still at an “I need programming lessons” stage after this, ChatGPT Plus is $20/mo and can answer when you have the expertise to know what to ask.