Recently I’ve been running into an issue with 4o mini (might happen with other models) where occasionally, instead of using a proper tool call, it will respond with the JSON schema in a normal assistant message. Has anyone seen this or know a potential cause?
Hi. I suspect this happens when the model has the intention to call a tool, but not the skill to do it properly.
The model has to emit a special token sequence to route its output to the tool recipient instead of the assistant message. It can skip that step and still produce tool-style output, which then lands in the plain assistant content. Reliable tool invocation comes from the function-calling post-training OpenAI has done on the model, not from anything you can add yourself.
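As a quick illustration of what that failure looks like from the API side, here is a minimal sketch using the OpenAI Python SDK. The tool name and the "does the content look like JSON" check are my own placeholders, not an official pattern:

```python
import json
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration only
            "description": "Returns current weather conditions for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)

message = response.choices[0].message
if message.tool_calls:
    # Proper path: the model emitted the tool-call sequence.
    print("tool call:", message.tool_calls[0].function.name)
else:
    # The failure mode described above: tool-style JSON leaks into plain content.
    try:
        leaked = json.loads(message.content or "")
        print("looks like a tool call leaked into the assistant message:", leaked)
    except json.JSONDecodeError:
        print("normal assistant reply:", message.content)
```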
Besides random occurrences, I’ve also seen reproducible cases of the opposite: the model can’t stop itself from invoking functions on a purely conversational user input, even when the tools were deliberately written to have no utility for that input.
There are a few causes to consider:
- Is the tool’s description written counter to its actual operation? Telling the AI how to call the tool, instead of simply what it returns and why it is useful, can have this effect, as can instructing tool use in the system instructions. (A sketch of a description written the recommended way follows this list.)
- The occasional nature may come from leaving temperature and top_p at their defaults. This is a high-perplexity model and doesn’t need temperature to be “creative”.
- On large input contexts, with a large amount of data placed between the system message (plus its tools) and your final user input, the model may lose attention on the tool definitions and only regain it partway through generation.
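To make the first point concrete, here is the shape of tool definition I would aim for. The function name and wording are placeholders of my own, not taken from your setup:

```python
# Hypothetical tool definition, for illustration only.
good_tool = {
    "type": "function",
    "function": {
        "name": "lookup_order_status",
        # Describe what the tool returns and when it is useful...
        "description": (
            "Returns the shipping status and estimated delivery date "
            "for a customer's order, given its order ID."
        ),
        # ...not how to call it. Avoid description text like:
        # "Call this by emitting {\"order_id\": ...} as JSON."
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order identifier.",
                },
            },
            "required": ["order_id"],
        },
    },
}
```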
The control one might wish for, using logit_bias to promote the initial token of the function-call sequence, is not allowed, and the nature of that token isn’t even documented.
I would start with the sampling parameters, since they are the easiest for you to change. See whether top_p: 0.1 stops those bad outputs from being presented to the user; if it does, the other strategies become a second priority. A minimal sketch is below.
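Assuming the standard OpenAI Python SDK, with a placeholder tool standing in for whatever you already pass:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Look up order 12345 for me."}],
    tools=[{  # your existing tool definitions go here; this one is a placeholder
        "type": "function",
        "function": {
            "name": "lookup_order_status",
            "description": "Returns the shipping status for a given order ID.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }],
    top_p=0.1,        # narrow nucleus sampling so only the most likely tokens survive
    temperature=0.2,  # optionally also reduce temperature from its default of 1
)

print(response.choices[0].message.tool_calls)
```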
I’ve been thinking it may have something to do with the instructions in the system message. I’ll check there first, thanks.