GPT-4-0125-preview hallucinating tool calls


We’re developing AI agents that utilize a variety of tools, with each agent handling approximately 10 tools. These tools can be called either sequentially or in parallel.

We quite recently migrated from the deprecated function calling model into tools and are now noticing that it often hallucinates which tools we have given the agent (the “tools” parameter).

To compensate for this we are doing automatic retries if it is calling a tool that does not exist. The way we’ve implemented this is that it returns a message saying that the tool does not exist. This message has role="tool" and the corresponding tool_call_id set, matching the ID from the assistant message. See example below:

  "tool_call_id": "call_uO4jmE3AXVNPFJnsNHidpAEx",
  "content": "Could not find tool with name multi_tool_use.parallel",
  "role": "tool"

However, this solution has led to a new problem. When the API receives this response, it returns a 400 error with the following message:

400 'multi_tool_use.parallel' does not match '^[a-zA-Z0-9_-]{1,64}$' - ''

Anyone else have the same experience or have any tips to how we can work around this issue?

Besides the namespace tool injection of function specification, with tools, the AI also gets this big waste of new tokens whether you want it or not:

## multi_tool_use

// This tool serves as a wrapper for utilizing multiple tools. Each tool that can be used must be specified in the tool sections. Only tools in the functions namespace are permitted.
// Ensure that the parameters provided to each tool are valid according to that tool's specification.
namespace multi_tool_use {

// Use this function to run multiple tools simultaneously, but only if they can operate in parallel. Do this even if the prompt suggests using the tools sequentially.
type parallel = (_: {
// The tools to be executed in parallel. NOTE: only functions tools are permitted
tool_uses: {
// The name of the tool to use. The format should either be just the name of the tool, or in the format namespace.function_name for plugin and function tools.
recipient_name: string,
// The parameters to pass to the tool. Ensure these are valid according to the tool's own specifications.
parameters: object,
}) => any;

} // namespace multi_tool_use

Your tool calls can be wrapped in a new recipient’s container for what is externally called “parallel tool calls”.

Injecting language about how the AI operates on API developers is the first step of betrayal.

Especially when the AI model can’t use it right, as is seen all over the forum. And something is going on now, especially with 1106, where multitool is being called even when there is no function for the AI to write.

I suspect that the AI is emitting this multi-tool even when there is no backend API recipient. It starts writing to= multitool because of the weighting or internal token production modification, and then there is no way for it to abort and return to not emitting a function. It gets the message that the recipient doesn’t exist, and then the AI is so messed up, it still can’t write to the user normally.

Hit this today as well. Hopefully they fix it in the next GPT-4-Turbo version they release. I’d guess it’s related to some outdated fine-tuning they did for enabling parallel function calling.

Hit this today as well. Hopefully it’s some intermittent issue that would be fixed by OpenAI soon?