I’m currently working on a project where I need to ensure that I receive 100% valid JSON responses from the GPT-4 API. It’s critical for our application that the JSON output is consistently reliable and error-free.
Could anyone here advise on best practices or methods to guarantee completely valid JSON responses, particularly from the GPT-4 API? Are there specific tools or techniques I should be using to validate these responses effectively?
You might get 99.99999% reliability with traditional deterministic code, but there are likely edge cases that will reduce that number to 99.99% or lower.
With a statistically influenced model such as an LLM, there is no way to guarantee that. No system offers single-attempt perfection; you can only approximate it with repetition, proper error checking, and result testing.
I am “100%” with you on that, but in a production environment you need to wrap even the most certain of calls in a try/except, and ideally also an output data check.
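For example, a minimal sketch of that pattern in Python, where `call_model` is a stand-in for whatever function actually hits the API and the required-keys check is illustrative:

```python
import json

def safe_json_call(call_model, max_retries=3):
    """Call the model, parse its reply as JSON, and retry on failure.

    call_model is a placeholder for whatever function returns the raw
    text of the model's reply.
    """
    for attempt in range(1, max_retries + 1):
        try:
            raw = call_model()
            data = json.loads(raw)          # raises on invalid JSON
            if not isinstance(data, dict):  # output data check
                raise ValueError("expected a JSON object at the top level")
            return data
        except (json.JSONDecodeError, ValueError) as exc:
            print(f"attempt {attempt} failed: {exc}")
    raise RuntimeError(f"no valid JSON after {max_retries} attempts")
```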
With 3.5 and earlier, before we had the JSON response format, I would include in the instructions to always return JSON, give it an example return format and, most importantly, tell it: “do not include any explanation”.
That gave me bare JSON every time, 100%, but I wasn’t running thousands of requests, so I suppose it wasn’t actually guaranteed.
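A hypothetical system prompt in that pre-JSON-mode style might look like this (the keys here are made up purely for illustration):

```python
# A made-up example of the instruction style described above.
SYSTEM_PROMPT = """You are an API that returns only JSON.
Always respond with a JSON object in exactly this format:
{"summary": "<one sentence>", "sentiment": "positive" | "neutral" | "negative"}
Do not include any explanation."""
```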
This is it exactly. I still have systems that I have not updated to use JSON mode; they work extremely reliably, but once in 5-20k calls one will fail with some random missing } or a misplaced comma. Most of it can be caught with checks, but omitting error catching is a bad habit to get into.
I have the same situation where I am forcing it to return a JSON object, with varying lengths. What seems to have worked for me is doing a bit of everything: the JSON response format, mentioning it in the system message as well as the prompt, and giving it an example of the format. It’s important to note that giving an example of actual output would skew my results, so I’m only passing the format, like this:
---BEGIN FORMAT TEMPLATE---
{'${BEHAVIOR}': '${REASONING}',
'${BEHAVIOR}': '${REASONING}',
'${BEHAVIOR}': '${REASONING}'
}
---END FORMAT TEMPLATE---
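Put together, a sketch of that belt-and-braces approach might look like the following. The model name and prompt wording are my own choices; the template is the one above, double-quoted so the example output is itself valid JSON:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The template from above; ${BEHAVIOR} and ${REASONING} are placeholders,
# not literal values.
FORMAT_TEMPLATE = """---BEGIN FORMAT TEMPLATE---
{"${BEHAVIOR}": "${REASONING}",
 "${BEHAVIOR}": "${REASONING}"}
---END FORMAT TEMPLATE---"""

completion = client.chat.completions.create(
    model="gpt-4-1106-preview",
    response_format={"type": "json_object"},  # JSON mode
    messages=[
        # Format mentioned in the system message...
        {"role": "system",
         "content": "Respond only with a JSON object matching this template:\n"
                    + FORMAT_TEMPLATE},
        # ...and again in the user prompt.
        {"role": "user",
         "content": "Analyze the behaviors in the input below and return JSON "
                    "in the template format.\n" + FORMAT_TEMPLATE},
    ],
)
print(completion.choices[0].message.content)
```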
Over time, I have faced some common issues with the JSON responses from OpenAI. They are not very frequent, but they do still occur. The responses can be malformed: missing brackets, extra tokens such as a stray json label from a markdown code fence, or single quotes instead of double quotes.
I have written a small utility that tries to address these discrepancies with post-processing.
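The utility itself isn’t pasted here, but a minimal sketch of the kind of repairs just described (fence stripping, single quotes, a missing closing bracket) could look like this:

```python
import ast
import json
import re

def repair_json(raw: str):
    """Best-effort repair of common defects in a model's JSON reply."""
    text = raw.strip()
    # Strip markdown code fences such as ```json ... ```
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text)
    try:
        return json.loads(text)  # strict parse first
    except json.JSONDecodeError:
        pass
    try:
        # Python-literal parsing tolerates single-quoted keys/values
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        pass
    # Last resort: try appending one missing closing bracket
    for suffix in ("}", "]"):
        try:
            return json.loads(text + suffix)
        except json.JSONDecodeError:
            continue
    raise ValueError("could not repair the response into valid JSON")
```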
I’m trying to get a response from the chat completion similar to this (list of JSON objects):
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
]
I have tried this approach:
However, when using the JSON response format I just get the first object of the list, in this case:
{"role": "system", "content": "You are a helpful assistant."}
I obtain the best results when not using the JSON response format parameter, but then it works only about 2 out of 5 times. I have tried re-calling the API with the wrong response as input, but that does not work every time. I have also tried working with two models, one to generate the data and a second to convert it to JSON, but that is not 100% reliable either.
What is the best approach to get the JSON response right from the first step (in my case, a list of JSON objects)?
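One common workaround, given that JSON mode produces a single top-level JSON object rather than a bare array: ask for the list under a wrapper key and unwrap it yourself. A sketch, where the key name "items", the model name, and the prompt wording are all my own assumptions:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

completion = client.chat.completions.create(
    model="gpt-4-1106-preview",
    response_format={"type": "json_object"},  # JSON mode: one top-level object
    messages=[
        {"role": "system",
         "content": 'Return a JSON object of the form {"items": [...]}, '
                    "where items contains the complete list. "
                    "Do not include any explanation."},
        {"role": "user", "content": "Generate the list of message objects."},
    ],
)

data = json.loads(completion.choices[0].message.content)
items = data["items"]  # the list of JSON objects you were after
```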