GPT-4o vs. gpt-4-turbo - function calling

Hi all,

Just want to know that I’m not alone.

I use gpt-4-turbo for a fairly simple text-generation task. The output is a JSON object returned through function calling (for output stability). I tried the same prompt with gpt-4o, but it seems to ignore the instruction to use function calling. Where gpt-4-turbo uses function calling in about 95% of cases, gpt-4o doesn't use it at all; it just returns the JSON object in the message content.
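
For reference, this is roughly the setup, using the openai Python SDK (the tool name and schema here are simplified placeholders, not my real ones):

from openai import OpenAI

client = OpenAI()

# Placeholder tool; the real schema is more involved.
tools = [{
    "type": "function",
    "function": {
        "name": "emit_result",
        "description": "Return the generated text as a JSON object",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "The generated text"}
            },
            "required": ["text"]
        }
    }
}]

response = client.chat.completions.create(
    model="gpt-4o",  # with "gpt-4-turbo" the tool call comes through ~95% of the time
    messages=[{"role": "user", "content": "Write a one-line product tagline."}],
    tools=tools,
)

message = response.choices[0].message
print(message.tool_calls)  # gpt-4o: often None
print(message.content)     # gpt-4o: the JSON object ends up here instead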

Anyone else experience this?

I’m using {"tool_choice": "required"} to guarantee it does choose a tool (Pydantic response model).

HOWEVER, I noticed gpt-4o does not consistently respect the JSON schema. For example, if it chooses to return markdown, it will completely ignore the required JSON schema. GPT-4 Turbo, on the other hand, does this very consistently: even when returning markdown, it will return the markdown content in the required JSON field.
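
Roughly what I'm doing, in case anyone wants to reproduce (the model and field names are simplified; the point is forcing a tool call with tool_choice="required" and then validating the arguments against the Pydantic schema):

from openai import OpenAI
from pydantic import BaseModel, ValidationError

class Answer(BaseModel):
    content: str  # simplified stand-in for my real response model

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this thread in one sentence."}],
    tools=[{
        "type": "function",
        "function": {
            "name": "answer",
            "description": "Return the answer as structured JSON",
            "parameters": Answer.model_json_schema(),
        },
    }],
    tool_choice="required",  # guarantees a tool is chosen
)

tool_call = response.choices[0].message.tool_calls[0]
try:
    answer = Answer.model_validate_json(tool_call.function.arguments)
    print(answer.content)
except ValidationError as e:
    # gpt-4o trips this far more often than GPT-4 Turbo in my tests,
    # especially when the answer contains markdown.
    print("Schema violation:", e)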

Same here: gpt-4o does not follow the JSON schema reliably.

You are not alone. It is very strange, though: yesterday I had to make a slight prompt change for our bot to start working, and it worked flawlessly all night. Then I woke up this morning and it won't call the correct function no matter what we do.

We are experiencing similar issues. The model ignores the required fields defined in the tool definitions and calls functions with missing info.

I’ve also noticed that if the function returns an error, the model ignores it and continues to the next instruction step as if it had received a success output.
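
For what it's worth, the error does go back to the model as a regular tool-role message on our side. A sketch of that step, assuming client, tools, messages, and the assistant turn that made the call are still in scope from the earlier request:

# Report the failed tool call back to the model as a tool-role message.
messages.append(assistant_message)  # the assistant turn containing the tool call
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,   # id of the failed call
    "content": "Error: connection timed out",
})

# Even with the error spelled out here, gpt-4o often carries on with
# the next instruction step as if the call had succeeded.
followup = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)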

Same problem for me. gpt-4o does use the tool I give it, but it uses it incorrectly. Here are the exact steps to reproduce. I give the model the following tool:

{
    "type": "function",
    "function": {
        "name": "execute_python_code",
        "description": "Executes a python program, returning the standard output",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "A python program with print statements"
                }
            },
            "required": ["code"]
        }
    }
}

And I give the model a task that requires the tool:

"messages": [
    {
      "role": "system",
      "content": "Use the provided tool to perform calculations for the user."  
    },
    {
        "role": "user",
        "content": "What's 1.23^4?"
    }
]

gpt-4-turbo-preview uses the tool correctly (as does gpt-3.5-turbo):

"function": {
    "name": "execute_python_code",
    "arguments": "{\"code\":\"result = 1.23**4\\nprint(result)\"}"
}

However, gpt-4o fails to supply the arguments in JSON form, and instead just dumps the code:

"function": {
    "name": "execute_python_code",
    "arguments": "base = 1.23\nexponent = 4\nresult = base ** exponent\nresult"
}

This feels like a fairly run-of-the-mill use case – is there anything I’m doing wrong? If not, I’m tempted to say this is a bug with gpt-4o.
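
In the meantime, a defensive workaround I'm testing (not a fix): check whether the arguments string is valid JSON before dispatching, and wrap the raw dump when it isn't.

import json

def parse_tool_arguments(arguments: str) -> dict:
    """Parse a tool call's arguments, tolerating gpt-4o's raw code dumps."""
    try:
        return json.loads(arguments)
    except json.JSONDecodeError:
        # gpt-4o sometimes emits the bare program instead of
        # {"code": "..."}; wrap it so downstream code still works.
        return {"code": arguments}

# With the failing gpt-4o response above:
args = parse_tool_arguments("base = 1.23\nexponent = 4\nresult = base ** exponent\nresult")
print(args["code"])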

I’m having similar issues. GPT-4o keeps trying to call a function instead of just responding in a LangGraph group chat. The agent that keeps failing has only one tool (which it uses correctly), but then it calls nonexistent tools.

I’ve tried playing with prompts all over the place to resolve this, but it consistently messes up.
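
As a stopgap I've been validating tool names before dispatching, so a hallucinated call turns into an error message the agent can recover from instead of a crash. A sketch, with an illustrative registry standing in for the agent's real tool:

# Illustrative registry; the real agent has a single working tool.
TOOL_REGISTRY = {
    "search_documents": lambda **kwargs: "stub result",
}

def dispatch(tool_name: str, arguments: dict) -> str:
    handler = TOOL_REGISTRY.get(tool_name)
    if handler is None:
        # Feed the mistake back instead of raising; the model can
        # usually correct itself on the next turn.
        return f"Error: unknown tool '{tool_name}'. Available: {list(TOOL_REGISTRY)}"
    return handler(**arguments)

print(dispatch("made_up_tool", {}))  # the kind of call gpt-4o keeps producing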