I’m making API calls to the GPT-4 Chat API, where I explicitly state that I want a JSON response. I’ve added a functions definition that explicitly describes how the JSON response should look, and I’ve used both the legacy function_call option and the newer tool_choice option to force OpenAI to use the defined function.
However, a significant amount of the time I find that the response abruptly deviates from what is expected, and the rest of the token limit is filled with a stream of text repeating \t\n \t\n \t\n \t\n \t\n \t\n \t\n \t\n \t\n \t\n… until it fills the whole response.
This only appears to happen when I force a function call response in JSON format. This doesn’t always happen, but approx 20% of the time it does. The longer the expected response, the more likely it is to happen.
Can anyone advise on how I can mitigate this? It appears to be a bug. I note that the API docs state the following, however I clearly state in my prompt, multiple times, that I expect JSON output:
Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly “stuck” request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
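For reference, here is a minimal sketch of roughly what my request looks like, using the Python SDK (the schema is heavily simplified and the names here are illustrative, not my real ones):

```python
# Sketch of roughly what I'm sending (simplified; the real schema is much bigger).
from openai import OpenAI

client = OpenAI()

quiz_tool = {
    "type": "function",
    "function": {
        "name": "create_quiz",  # illustrative name
        "description": "Return quiz data as JSON.",
        "parameters": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    },
}

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "Respond only with JSON describing the quiz."},
        {"role": "user", "content": "Create a quiz question about the solar system."},
    ],
    tools=[quiz_tool],
    tool_choice={"type": "function", "function": {"name": "create_quiz"}},  # force the function
    response_format={"type": "json_object"},  # JSON mode on top of the forced tool call
)
print(response.choices[0].message)
```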
Can you give us some idea of what you are prompting?
Generally, if the input isn’t satisfactory, and the model doesn’t “want” to produce JSON, then you will run into problems that JSON mode can’t solve. It’s a bit of a brute force method.
The first step is to get satisfactory JSON without using JSON mode. Then, if you want, you can add it to ensure that the output is valid.
However, I’d recommend taking an approach where the model “reasons” about the contents first, and then creates it.
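For example, something along these lines (a rough sketch only; the prompts, schema and model name are placeholders):

```python
# Rough two-step sketch: reason first, then format as JSON (no JSON mode yet).
import json
from openai import OpenAI

client = OpenAI()

# Step 1: let the model "reason" the quiz content in plain text.
plan = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Draft a quiz question about photosynthesis "
                                     "with 4 answer options and note which is correct."},
    ],
).choices[0].message.content

# Step 2: ask it to convert that draft into the exact JSON shape you need.
formatted = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": 'Reply with JSON only, shaped like '
                                      '{"question": str, "choices": [str, str, str, str], "answer_index": int}.'},
        {"role": "user", "content": plan},
    ],
).choices[0].message.content

quiz = json.loads(formatted)  # once this is reliable, consider adding response_format={"type": "json_object"}
```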
Thanks for the response. It’s a JSON object that contains quiz data - i.e. a question and 4 multiple-choice answers.
Oddly, the response starts off looking like a JSON object, then part way through it just goes wild and fills it full of that whitespace.
I’ve noticed that if I don’t force it to give a JSON response, it’ll still give me a text string that looks like a JSON object (but I can’t use that in my app as it’s detected as a string).
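On the “detected as a string” part: the message content always comes back as a string, so it has to be parsed before your app can treat it as an object. A minimal sketch (the sample data here is made up):

```python
import json

raw = '{"question": "Which planet is largest?", "choices": ["Earth", "Jupiter", "Mars", "Venus"], "answer_index": 1}'
quiz = json.loads(raw)                         # the JSON-looking string becomes a dict
print(quiz["choices"][quiz["answer_index"]])   # -> Jupiter
```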
You do not need to use JSON mode specifically for functions - that is a bad idea because the AI will only call functions when they are needed to satisfy user input.
Talking about functions in a system prompt is also a good way to distort their output.
Beyond that, it is the quality of the function’s descriptions and key names and specification that will allow it to be used properly by the AI. You are using actual API “functions” with correct specification, right?
You can add a line to each function’s main description, such as: “only accepts validated JSON beginning with {”.
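As a sketch (not your exact schema; the names and descriptions here are made up), a well-described tool specification, and the call that reads its arguments, might look like this:

```python
# Sketch of a descriptive function/tool spec, without JSON mode (illustrative names).
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "save_quiz_question",
        "description": "Record one multiple-choice quiz question. "
                       "Only accepts validated JSON beginning with {.",
        "parameters": {
            "type": "object",
            "properties": {
                "question": {"type": "string", "description": "The question text."},
                "choices": {
                    "type": "array",
                    "items": {"type": "string"},
                    "minItems": 4,
                    "maxItems": 4,
                    "description": "Exactly four answer options.",
                },
                "answer_index": {"type": "integer", "description": "Index (0-3) of the correct option."},
            },
            "required": ["question", "choices", "answer_index"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",  # no response_format / JSON mode here
    messages=[{"role": "user", "content": "Make a quiz question about volcanoes."}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model chose to call the function
    arguments = json.loads(message.tool_calls[0].function.arguments)  # arguments arrive as a JSON string
    print(arguments["question"])
```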
I also have a GPT that I use to always write a “function” to pass onto an API to display some cool stuff.
I would highly recommend turning off both function calling and JSON mode to begin with, and instead feeding it prompt examples. There are a few reasons why:
If you always expect JSON, you should also expect bad/irrelevant text, and use this in your examples to work around it. In my case it’s a simple { invalid: true }
It opens up the possibility of first “reasoning” about the JSON object instead of diving straight in.
If the object is bad, or simply doesn’t exist, you can just try again (see the sketch below). It’s very easy to validate, and in my experience this has not been an issue.
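Roughly, the loop looks like this for me (a simplified sketch; the schema, prompt and helper name are all made up for illustration):

```python
# Simplified sketch of the validate-and-retry idea (prompt and names are illustrative).
import json
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "Return a quiz question as JSON: "
    '{"question": str, "choices": [str, str, str, str], "answer_index": int}. '
    'If you cannot, return {"invalid": true}.'
)

def get_quiz(topic: str, attempts: int = 3) -> dict | None:
    """Ask for the JSON object and retry if it fails to parse or is flagged invalid."""
    for _ in range(attempts):
        content = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": f"Topic: {topic}"},
            ],
        ).choices[0].message.content
        try:
            obj = json.loads(content)
        except json.JSONDecodeError:
            continue  # bad/irrelevant text: just try again
        if not isinstance(obj, dict) or obj.get("invalid"):
            continue  # the model said it couldn't produce the object
        if {"question", "choices", "answer_index"} <= obj.keys():
            return obj  # passes the simple validation
    return None

quiz = get_quiz("Roman history")
```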
I have been using this concept since gpt-3.5 and haven’t had any issues. To be brutally fair, I have not tried to transfer it (if it works…), but I do use function calling for other GPTs when I would like the model to determine the appropriate time to call it.
Once you have some good results, I think it’s time to hook it all up, if necessary.