Function calling responds in parallel but differently in the tool call and in the message thread

when i respond to a function call by submitting tool outputs, the model returns two different responses:

  • one in the tool call response returned by the function call on the run
  • one in the message list from the thread

these two responses are different, and trying to use one over the other moving forward is inconsistent, since each answer sometimes has different details that are useful to the continuing conversation. i am unsure whether to respond by submitting tool outputs or by responding to the message thread

if i respond to the message thread, should i kill the run? if so, it feels like i’m missing something obvious about how run lifecycles operate in parallel with the continuing thread. i can apparently have a back-and-forth conversation just by submitting tool outputs, while similar but diverging messages from the model keep appearing on the thread even though i’m not responding to the thread directly
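
for reference, here’s a minimal sketch of the two places i’m looking, using the openai python sdk (v1.x); the ids and output payload are placeholders:

```python
from openai import OpenAI

client = OpenAI()

THREAD_ID = "thread_abc123"  # placeholder ids for illustration
RUN_ID = "run_abc123"

# path 1: answer the function call by submitting tool outputs on the run
run = client.beta.threads.runs.submit_tool_outputs(
    thread_id=THREAD_ID,
    run_id=RUN_ID,
    tool_outputs=[
        {"tool_call_id": "call_abc123", "output": '{"result": "..."}'},
    ],
)

# path 2: read whatever the model has appended to the thread as messages
messages = client.beta.threads.messages.list(thread_id=THREAD_ID)
print(messages.data[0].content[0].text.value)  # newest message first by default
```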

Not sure I got exactly what you’re saying, but the tool_call response would just be an acknowledgement that your function call responses were accepted. In the message list I would expect a regular GPT ‘response’ based on the way it has processed the function calls. Is that not what you mean?

hey @jlvanhulst i appreciate you responding. perhaps i am misunderstanding the intended use case of function calling. my understanding is that it is used to ensure the assistants api returns a properly formatted response for use in external functions.

ideally, standard messages in the thread would simply carry this format assurance and the parallel function calling would not be needed, but as it is, both operate simultaneously. the problem occurs when the message in the thread contains a better response than the tool call output, but the message is not in a format that can be predictably parsed into the needed structure, so only the inferior response is usable.

sometimes the message in the thread has additional information that is referred to by subsequent responses, but if that message is not usable, the needed context is missing from the external systems that use the output.

for example, one use case i have is a dynamic story where the model creates a narrative, asks the user questions, and continues the story based on the user’s response. i need the response in a specific, predictable format for use elsewhere, but sometimes the message has better narrative content than the tool call output.

sometimes the message even includes additional information that is not in the tool call output but is referenced by the model’s next response, which is really confusing for the user. so if we’re at point A in our story, we have two outputs, A1 and A2, which have different amounts of narrative information. if i choose output A2 from the tool call output, then when the user gets to point B in the story, the model references information in A1 that was not in A2.

at this point the assistants api is not usable for my use case, so i am using standard chat completions and just uploading all the context in each call. for assistants threads to be usable, either the divergence issue would need to be solved, or a json response format added to the message output.
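
for what it’s worth, the workaround looks roughly like this (openai python sdk v1.x; the model name, schema, and history are just illustrative):

```python
from openai import OpenAI

client = OpenAI()

# the full story context is re-sent on every call, since there is no thread
history = [
    {
        "role": "system",
        "content": 'You are a storyteller. Respond only with JSON like '
                   '{"narrative": "...", "question": "..."}.',
    },
    {"role": "user", "content": "Start a story about a lighthouse keeper."},
    # ...every prior narrative turn and user answer appended here...
]

completion = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=history,
    response_format={"type": "json_object"},  # available on chat completions
)

print(completion.choices[0].message.content)  # a JSON string to parse downstream
```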

Perhaps so.

A tool (only functions are allowed) is an external access method to a function of your code that the AI may employ.

A function might be something that informs the AI, like a calculator, a database, or a web search.

A function might be something like an action or trigger, like posting a tweet, or even hanging up on the user.

What it is not, especially within assistants, is a way to get a response in a particular format.
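
For illustration, an ‘informs the AI’ kind of tool might be declared something like this (the name and schema are invented for the example):

```python
# a hypothetical calculator tool: something that informs the AI, not a formatter
calculator_tool = {
    "type": "function",
    "function": {
        "name": "calculate",
        "description": "Evaluate a basic arithmetic expression and return the result.",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "An arithmetic expression, e.g. '12 * (3 + 4)'",
                }
            },
            "required": ["expression"],
        },
    },
}
```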

For that, you would just specify to the AI the output you want written and its purpose. You can be quite robust in your description, such as providing a full JSON schema with descriptions and examples as the response specification (which could also be used to validate the AI output).
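
A rough sketch of that approach, assuming the `jsonschema` package for validation; the schema itself is only an example:

```python
import json
from jsonschema import ValidationError, validate

# example response specification; embed it in the assistant's instructions,
# then reuse the same schema to check what comes back
story_schema = {
    "type": "object",
    "properties": {
        "narrative": {"type": "string", "description": "The next beat of the story."},
        "question": {"type": "string", "description": "A question posed to the user."},
    },
    "required": ["narrative", "question"],
    "additionalProperties": False,
}

instructions = (
    "Reply only with a JSON object that matches this schema, with no prose around it:\n"
    + json.dumps(story_schema, indent=2)
)

def parse_reply(raw_text: str) -> dict:
    """Parse the assistant's message text and validate it against the schema."""
    data = json.loads(raw_text)
    try:
        validate(instance=data, schema=story_schema)
    except ValidationError as err:
        raise ValueError(f"Model output failed schema validation: {err.message}")
    return data
```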

how is a function to be properly called with the correct inputs if the response is not in a particular format?

currently the json_object response_format is not available in the assistants api. it is only available in the chat completions api.

the standard message response is unpredictable in its format without that feature. even 10% of responses being incorrectly formatted makes a product unusable for users.

The tool call mechanism of gpt-4-turbo models uses the response_format JSON technique, which can actually be problematic.

The output of functions is reliable because you create a specification in a particular format, and the AI has extensive fine-tuning on how to write and send functions.

And you can otherwise control any unpredictability by not using assistants but chat completions, where you can control the sampling parameters. If you said “assistants is not the best way to do X”, you’d be right for the largest subset of tasks.
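
A quick sketch of what that looks like on chat completions (model and values are arbitrary):

```python
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "Continue the story in one short paragraph."}],
    temperature=0,  # low temperature for more repeatable phrasing
    top_p=1,
    seed=42,        # best-effort reproducibility across calls
)
print(completion.choices[0].message.content)
```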