When GPT-4o tries to call a custom function, it doesn’t actually call any function. Instead, it prints the function call in the normal assistant response.
Sometimes it works fine, but most of the time it prints the function call and its arguments right in the assistant response. That makes the model unusable when you need custom functions to work, not to mention it confuses the hell out of users.
What can I do? It doesn’t seem to matter how I word the function description. GPT-4-turbo, GPT-4, and GPT-3.5-turbo all work fine with custom functions in my code.
Are you setting the function_call parameter to force it to use a given function for chat completions, or the equivalent tool_choice param for thread runs?
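For reference, forcing a specific function in chat completions looks roughly like this (a sketch with the openai Node SDK; get_weather and its schema are made-up placeholders):

import OpenAI from 'openai';

const openai = new OpenAI();

const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
  tools: [
    {
      type: 'function',
      function: {
        name: 'get_weather',
        description: 'Get the current weather for a city',
        parameters: {
          type: 'object',
          properties: { city: { type: 'string' } },
          required: ['city'],
        },
      },
    },
  ],
  // Force this tool instead of letting the model decide ('auto').
  tool_choice: { type: 'function', function: { name: 'get_weather' } },
});

console.log(completion.choices[0].message.tool_calls);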
I’m having the exact opposite issue: GPT-4o keeps trying to call a non-existent function instead of just responding in a LangGraph group chat. The agent that keeps failing has only one tool (which it uses correctly), yet it also calls a non-existent tool.
I’ve tried adjusting prompts all over the place to resolve this, but it consistently messes up.
I’m using chat completions and it’s correctly returning a function call, but I’m definitely seeing worse performance than GPT-4. It’s generally making worse decisions, not obeying enums as well as before, and often calling only one function where multiple are needed.
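(By “enums” I mean enum constraints in the function’s JSON Schema parameters, e.g. a made-up sketch like:)

const parameters = {
  type: 'object',
  properties: {
    // "not obeying enums" = the model returning values outside this list
    unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
  },
  required: ['unit'],
};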
I agree, and I find function calling in GPT-4o unusable compared to GPT-4 Turbo. In some cases I see the same function called 2 or 3 times, with the parameters correct for one call and completely made up for the others. I’m also seeing function calls returned in the assistant response. And when the system message instructs the model to call only a single function, multiple functions are often called.
In my case, gpt-4o prints the call as plain text in the assistant message content, e.g. functions.my_function({arg1: "some value"}), as if it is calling a JS function.
My (very temporary) solution is to manually parse this string to retrieve the function name and arguments. Here’s the code (Node.js + TypeScript + LangChain):
import type { AIMessage } from '@langchain/core/messages';
import jsonc from 'jsonc-parser';

/**
 * Manually parse and rectify a faulty tool-calling response from gpt-4o.
 * Detects a `functions.<name>(...)` string in the message content, extracts
 * the name and arguments, and moves them into msg.tool_calls.
 * @param {AIMessage} msg
 * @returns {void}
 */
export function patchFaultyFunctionCall(msg: AIMessage): void {
  if (!msg.content || typeof msg.content !== 'string') {
    return;
  }
  // Capture the function name between "functions." and "(" (or a newline).
  const matchResult = msg.content.match(/functions\.(.*?)[(\n]/);
  if (!matchResult) {
    return;
  }
  // The string that comes after functions.${function_name}(
  // If the argument is an object, gpt-4o sometimes omits double quotes around
  // attribute names, making it an invalid JSON string. quotifyJSONString adds
  // the missing quotation marks.
  const jsonString = quotifyJSONString(
    msg.content.substring((matchResult.index ?? 0) + matchResult[0].length),
  );
  const [jsonObj, prefixLen] = parseJSONPrefix(jsonString);
  if (prefixLen === 0) {
    return;
  }
  // Strip the function-call text from the visible content; keep whatever
  // follows the arguments (typically just the closing parenthesis).
  msg.content = jsonString.substring(prefixLen);
  if (msg.tool_calls === undefined) {
    msg.tool_calls = [];
  }
  msg.tool_calls.push({
    name: matchResult[1],
    args: jsonObj,
  });
}

function quotifyJSONString(unquotedJson: string): string {
  // An attribute name preceded by "{" or "," and followed by ":".
  const attributePattern = /([{,]\s*)([a-zA-Z_][a-zA-Z0-9_]*)(\s*:)/g;
  // Replace unquoted attribute names with quoted ones.
  return unquotedJson.replace(attributePattern, '$1"$2"$3');
}

/**
 * Try to parse the prefix of a JSON string into an object.
 * @param {string} str A string that might have a valid JSON prefix.
 * @returns {[any, number]} [The parsed object, size of the valid JSON prefix]
 */
function parseJSONPrefix(str: string): [any, number] {
  const errors: jsonc.ParseError[] = [];
  // jsonc-parser records errors instead of throwing; the first error's
  // offset tells us how much of the string parsed cleanly.
  const obj = jsonc.parse(str, errors);
  if (errors.length === 0) {
    return [obj, str.length];
  }
  if (errors[0].offset === 0) {
    // No valid prefix
    return [undefined, 0];
  }
  return [obj, errors[0].offset];
}
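A usage sketch (assuming a LangChain ChatOpenAI model; tools and messages stand in for your own definitions):

import { ChatOpenAI } from '@langchain/openai';
import type { AIMessage } from '@langchain/core/messages';

const model = new ChatOpenAI({ model: 'gpt-4o' }).bindTools(tools); // tools: your tool definitions
const msg = (await model.invoke(messages)) as AIMessage;            // messages: your conversation
patchFaultyFunctionCall(msg); // backfills msg.tool_calls if the call came back as plain text
if (msg.tool_calls?.length) {
  // dispatch the tool calls as usual
}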
Hi @cbarber713, OAI staff here, and sorry for the late reply. I’d like to get more details on your use case. It sounds like you were using:
assistants API v2
with tools and tool_choice param
My question is: what value did you pass to tool_choice? Was it "auto", "required", or a specific function like {"type": "function", "function": {"name": "your_func_name"}}?
Any information that can help me reproduce this bug is highly appreciated.
@turbolucius, @voidptr_t, @mahnoorrana.dev: if any of you can provide details that would help me reproduce this bug, it would help me figure out what’s going on faster.
@voidptr_t Thanks for providing the example. I did some digging today and my early hypothesis is that gpt-4o likely requires more explicit and accurate instructions than 4-turbo for function calling.
Since the main issue in your example is that the model chose to output a plain text message (with the function call in JavaScript syntax) instead of returning tool_calls, I changed
If you feel they are expecting a response from you, output your response.
to
If you feel they are expecting a response from you, output your response by using a tool.
in the system prompt.
I ran your example 1000 times in a script and the issue went away (i.e. gpt-4o called the function about as often as gpt-4-turbo did).
Feel free to give it a try and let me know how it works for you. We will look deeper into this and will share more findings or good practices soon.
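In case anyone wants to run the same kind of measurement, the loop is roughly the sketch below (openai Node SDK; messages and tools are placeholders for whatever your failing example uses):

import OpenAI from 'openai';

const openai = new OpenAI();

// Placeholders: substitute the exact messages/tools from the failing example.
declare const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[];
declare const tools: OpenAI.Chat.Completions.ChatCompletionTool[];

let toolCallRuns = 0;
for (let i = 0; i < 1000; i++) {
  const res = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages,
    tools,
    tool_choice: 'auto',
  });
  // Count runs where the model returned a structured tool call rather than
  // printing the call as plain text in the message content.
  if (res.choices[0].message.tool_calls?.length) toolCallRuns++;
}
console.log(`tool_calls returned in ${toolCallRuns}/1000 runs`);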
@cbarber713 Please see my above comment. I’d suggest you play with your system prompt, trying to be as explicit and specific as possible on when you expect the model to use the tool. I’m curious to hear if that helps with your use case.
If you are willing to share some concrete examples of yours, I’m happy to take a look on my side too.
Sorry for the late reply. I went out of town and got sick and was stuck. Bit of a nightmare but anyways…
I am using auto for tool_choice because I need the AI to decide on its own when to call the function. For the same reason, I can’t really tell it in the system prompt to respond with a tool every time. I did try that as a test, but the issue remains: the AI produces a text response with the function call syntax shown. It seems no amount of manipulating the system prompt will fix this.
One example: we have a contact form that the AI can show to the user via a custom function (sketched below). The form should only be shown when appropriate based on the conversation. I’ve played with the system prompt and with the instructions in the custom function, but nothing seems to help. gpt-4-turbo does not have this issue at all.
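To illustrate, here is a simplified sketch of that kind of tool definition (hypothetical names; the description tries to be explicit about when to call it, per the suggestion above):

const tools = [
  {
    type: 'function' as const,
    function: {
      name: 'show_contact_form', // hypothetical name
      description:
        'Display the contact form to the user. Only call this when the ' +
        'conversation indicates the user wants to be contacted, for example ' +
        'when they ask to speak with a human or request a follow-up.',
      parameters: { type: 'object', properties: {}, required: [] },
    },
  },
];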