I mentioned the playground only because it makes the problem easy to demonstrate; for us it first appeared in the API, in production.
To put it simply, it is “no longer possible” to use the “json_object” response format together with function calling.
I’m talking about json_object, not json_schema. This started happening out of the blue on the May model (and then the August model), right after the announcement of the new json_schema format for structured outputs.
My point is simply that this used to work very well, and still does with GPT-4 for example, but no longer with GPT-4o, which then calls the available functions in a completely incoherent way.
For what it’s worth, this is the case whether or not parallel tool calls are enabled.
Another detail: tool_choice is not set to “required”, precisely because I have external tools available but am ultimately expecting a JSON response.
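For reference, here is a minimal sketch of the failing setup, using the official Python SDK; the model name, tool definition, and prompts are illustrative stand-ins, not my actual production code:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical external tool, standing in for the real ones in production.
tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Fetch an order by its identifier.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # json_object mode requires that the prompt mention JSON explicitly.
        {"role": "system", "content": "Answer with a single JSON object."},
        {"role": "user", "content": "Summarize order 1234."},
    ],
    tools=tools,
    # tool_choice is left at its default ("auto"), not "required",
    # matching the setup described above.
    response_format={"type": "json_object"},
)

print(response.choices[0].message)
```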
I don’t think this is necessarily due to the LLM itself, but rather to a conflict in a post-processing layer.
To get out of it, I switched back to plain-text mode and validated the JSON output myself, which finally worked (a rough sketch of that approach is below).
It’s just an observed and very real regression.
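Here is a rough sketch of that workaround; it reuses the client and tools from the sketch above, and the prompt and error handling are only illustrative:

```python
import json

# Same request as before, but without response_format: plain-text mode,
# with the JSON validated client-side.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Reply with a single JSON object and nothing else."},
        {"role": "user", "content": "Summarize order 1234."},
    ],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    pass  # the model chose to call a tool; dispatch the tool call as usual
else:
    try:
        payload = json.loads(message.content)
    except (json.JSONDecodeError, TypeError):
        payload = None  # retry or surface an error in real code
```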
Also, nowhere in the documentation, particularly since the introduction of structured outputs, have I found a note saying that “json_object” is now incompatible with the use of functions.
If that is now the intended behavior, I think the API should return an error whenever a request uses function calling (via tools) while its response format is set to “json_object”.
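Until the API itself rejects the combination, a simple client-side guard along these lines could catch it before the request goes out (the helper name and message are my own, purely illustrative):

```python
def reject_incompatible_format(params: dict) -> None:
    """Raise if a request mixes tools with the json_object response format."""
    fmt = (params.get("response_format") or {}).get("type")
    if fmt == "json_object" and params.get("tools"):
        raise ValueError(
            "response_format=json_object combined with tools currently leads to "
            "incoherent function calls; use json_schema or plain text instead."
        )
```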
I totally agree with you on this. While we are not currently seeing a high occurrence of this error (the looping tool calls when using a model with the json_object response type), I have seen some instances on our gpt-4-turbo implementation, and the error rate is essentially 100% when using 4o/4o-mini. I wanted to get ahead of it since we are looking to bump our model version.
I agree that the API should be updated to reflect the incompatibility, or at minimum the docs should note that this can happen when the json_object response type is combined with function calling.
Hi, OAI staff here. Sorry, this was indeed caused by a bug on our end.
To provide more context: this bug was not caused directly by the new structured outputs feature. We made a change on 8/28 aiming to improve function calling reliability by preventing tool hallucinations, but it introduced this model behavior bug.
I’ll work on a fix soon and keep you guys posted.
This has now been fixed, please try again and let me know if anything is not working as expected.
Thank you @brianz-oai, will take a look first thing in the AM and report back!
Thank you for the update @brianz-oai. It seems the issue is resolved on our end.
We’ve noticed this strange behavior for a few weeks now and were wondering why it’s happening with models released in May and early August. Are there any updates you release that aren’t related to the model but still impact the results?
I double-checked this just now in the playground (where I had been able to replicate the issue consistently) and can confirm that it is now behaving as expected.
Many thanks again @brianz-oai!