How to get 100% valid JSON answers?

Hello everyone,

I’m currently working on a project where I need to ensure that I receive 100% valid JSON responses, especially since we’re utilizing the GPT-4 API. It’s critical for our application that the JSON output is consistently reliable and error-free.

Could anyone here advise on best practices or methods to guarantee completely valid JSON responses, particularly from the GPT-4 API? Are there specific tools or techniques I should be using to validate these responses effectively?

Thank you in advance for your help and insights!

Well, you can run a bunch of API calls in parallel and select one output that passes your validator.
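The selection step can be sketched like this. It is a minimal example, assuming the parallel calls have already produced a list of candidate strings; the validator here is simply `json.loads`, but any schema check would slot in the same way.

```python
import json

def first_valid_json(candidates):
    """Return the first candidate that parses as JSON, else None.

    `candidates` stands in for the outputs of several parallel API
    calls (hypothetical here -- any list of strings works).
    """
    for text in candidates:
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue
    return None

# Example: two malformed outputs and one valid one.
outputs = ['{"a": 1,}', "{'a': 1}", '{"a": 1}']
print(first_valid_json(outputs))  # → {'a': 1}
```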

1 Like
  1. Instead of aiming for perfection, aim for 99% and then use proper error handling with feedback loops.

  2. Give few-shot examples of good JSON objects. Be careful with these examples though, because they will affect the outcomes.

  3. Keep it shallow

  4. There is a JSON mode but it can still fail and I wouldn’t recommend using it.

There’s a lot more advice but it depends on your use-case.
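Point 1 (error handling with a feedback loop) can be sketched as below. `call_model` is a placeholder for your real GPT-4 API call; the demo swaps in a fake model so the loop itself is visible.

```python
import json

def generate_with_retry(call_model, prompt, max_attempts=3):
    """Parse the model's reply; on failure, feed the error back and retry.

    `call_model(prompt)` is a stand-in for a real GPT-4 API call.
    """
    for _ in range(max_attempts):
        reply = call_model(prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as err:
            # Feedback loop: tell the model what was wrong with its output.
            prompt = (f"Your previous reply was not valid JSON ({err}). "
                      f"Reply again with only valid JSON:\n{reply}")
    raise ValueError("no valid JSON after retries")

# Demo with a fake model that fails once (trailing comma), then succeeds.
replies = iter(['{"x": 1,}', '{"x": 1}'])
result = generate_with_retry(lambda p: next(replies), "Return x=1 as JSON")
print(result)  # → {'x': 1}
```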

3 Likes

You might get 99.99999% performance with traditional deterministic code, but there are likely edge cases that will reduce that number down to 99.99% or lower.

With a statistically influenced model such as an LLM, there is no way to ensure that. No system offers single-attempt perfection; you can only approximate it with repetition, proper error checking, and result testing.

2 Likes

Why has nobody mentioned "response_format": { "type": "json_object" }? :slight_smile:

4 Likes

It’s still only mostly correct :smiley:


1 Like

But it’s much more stable than without it (especially for 3.5), and with generation of, say, 5–10 variations you can make it almost bulletproof :slight_smile:

1 Like

I am “100%” with you on that :smile: but in a production environment you need to wrap even the most certain of calls in a try/except and hopefully also an output data check.
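A minimal sketch of that wrapper, assuming an example schema with required keys "name" and "score" (your real data check would be stricter):

```python
import json

def parse_and_check(raw):
    """Wrap parsing in try/except, then check the shape of the result.

    The required keys ("name", "score") are just an example schema.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # or log and retry
    if not isinstance(data, dict) or not {"name", "score"} <= data.keys():
        return None  # parsed fine, but not the structure we expected
    return data

print(parse_and_check('{"name": "a", "score": 3}'))  # → {'name': 'a', 'score': 3}
print(parse_and_check('{"name": "a"}'))              # → None
```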

1 Like

With 3.5 and before we had response format in JSON, I would include in the instructions to always return JSON, give it an example return format and, most importantly, tell it: “do not include any explanation”.

That would give me bare JSON every time. 100%, but I wasn’t running 1000s of requests, so I suppose it wasn’t actually guaranteed.

2 Likes

This is it exactly. I still have systems that I have not updated to use JSON mode, and they work extremely reliably, but once in 5–20k calls one will fail with some random missing } or a misplaced ,. Most of it can be caught with checks, but omitting error catching is a bad habit to get into.

1 Like

“I didn’t see any cars”

  • Says every person that gets hit by a car
3 Likes

I know a few methods after recent trials:

  1. llama.cpp:
    In recent months it has added support for constraining the streamed output to a given format (e.g. grammar-based restrictions on what tokens can be emitted).
  2. Post-processing:
    Fix the output with hard-coded rules, or by re-calling the LLM API; such methods appear in the LangChain framework.
  3. Careful prompting:
    LLMs are sensitive to prompts, so a prompt may work in some scenarios but not in others.
1 Like

I have the same situation, where I am forcing it to return JSON objects of varying lengths. What seems to have worked for me is doing a bit of everything: the JSON response format, mentioning it in the system message as well as the prompt, and giving it an example of the format. Important to note that giving an example of the output would tweak my results, so I’m only passing the format like this:

---BEGIN FORMAT TEMPLATE---
{'${BEHAVIOR}': '${REASONING}',
'${BEHAVIOR}': '${REASONING}',
'${BEHAVIOR}': '${REASONING}'
}
---END FORMAT TEMPLATE---

Wanted to recommend Promptotype for testing and validating this use case.
It’s a platform for developing, testing, and monitoring JSON-output prompts: you can define collections of queries with an expected output JSON schema (and values if needed) and verify that your prompt and model configuration perform as expected.
*I’m the creator, so feel free to reach out with questions or feedback.

1 Like

I have never had function calling return an invalid response. I use it for all json responses even if they’re not functions.
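For context, function calling works by pinning the model’s output to a JSON Schema you supply. The tool definition below is an illustrative example (the function name and fields are made up), showing the shape of what gets passed and what comes back:

```python
import json

# Illustrative tool definition: function calling makes the model emit
# arguments matching this JSON Schema, which is why it is so reliable.
tools = [{
    "type": "function",
    "function": {
        "name": "record_answer",  # hypothetical function name
        "parameters": {
            "type": "object",
            "properties": {
                "answer": {"type": "string"},
                "confidence": {"type": "number"},
            },
            "required": ["answer"],
        },
    },
}]
# Passed as `tools=tools` to chat.completions.create; the reply's
# tool-call arguments arrive as a JSON string ready for json.loads().
sample_arguments = '{"answer": "42", "confidence": 0.9}'
print(json.loads(sample_arguments)["answer"])  # → 42
```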

2 Likes

Over time, I have faced some common issues with the JSON responses from OpenAI. Though not very frequent, they can still occur: responses malformed by missing brackets, extra words like “json”, or single quotes instead of double quotes.

I have written this small utility which tries to address these discrepancies with post-processing.

You can find the project on GitHub:

:link: GPT JSON Sanitizer

Feel free to check it out, and let me know of more cases, as I know there will be more, and we can together extend it further.
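To give a flavor of the idea (this is a minimal sketch, not the linked project’s actual code), two of the cases above can be handled like this:

```python
import json
import re

def sanitize(raw):
    """Minimal sketch of JSON clean-up, not the GPT JSON Sanitizer itself.

    Handles two common issues: markdown code fences around the object
    and a stray leading "json" label.
    """
    text = raw.strip()
    # Strip ```json ... ``` fences if present.
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text)
    # Drop a bare leading "json" word.
    text = re.sub(r"^json\s*", "", text)
    return json.loads(text)

print(sanitize('```json\n{"ok": true}\n```'))  # → {'ok': True}
```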

I’m trying to get a response from the chat completion similar to this (a list of JSON objects):

messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]

I have tried with this approach:

However, when using the response format in JSON I just get the first object of the list. In this case

{"role": "system", "content": "You are a helpful assistant."}

I obtain the best results when not using the JSON response format parameter, but then it works just 2/5 times. I have tried re-calling the API with the wrong response as input, but it doesn’t work every time. I have also tried working with two models, one to generate the data and a second to convert it to JSON; this is also not working 100%.

What is the best approach to get the JSON response right from the first step? (In my case, a list of JSON objects.)
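One likely cause, going by the docs: JSON mode produces a single top-level object, so asking for a bare list tends to get truncated to one element. A common workaround (a sketch, with "messages" as an assumed key name) is to prompt for the list wrapped in an object key and unwrap it yourself:

```python
import json

# JSON mode emits one top-level object, so prompt the model for
# something like {"messages": [...]} and extract the list afterwards.
raw = ('{"messages": ['
       '{"role": "system", "content": "You are a helpful assistant."}, '
       '{"role": "user", "content": "Hello!"}]}')
messages = json.loads(raw)["messages"]
print(len(messages))  # → 2
```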