Inconsistent and invalid JSON response

I have been experimenting with OpenAI and constructed a test that provides some context and requests that questions be generated. The resulting JSON provides the requested text in the ‘choices/text’ item, but the string value is inconsistent and sometimes garbled, even for the same input run over and over.

My prompt is as follows:

$prompt ='
Given the article below, create a JSON object which enumerates a set of 5 child objects.  
{snipped}                     
   Each child object has a property named "retQues" and a property named "retAns" and a property named "mcOptA" and a property named "mcOptB" and a property named "mcOptC" and a property named "mcOptD".
   The resulting JSON object should be in this format: [{"retQues":"string","retAns":"string","mcOptA":"string","mcOptB":"string","mcOptC":"string","mcOptD":"string"}]}].\n\n
   The article:\n
   ' . $article . '\n\n
';

$article can be any text of a few paragraphs.

What I get back contains the following errors:

  • Smart quotes instead of normal double quotes surrounding some of the JSON keys or values.
  • JSON keys split into two, e.g. “mc OptC”.
  • JSON keys in the wrong case, e.g. “mcoptc”.
  • Missing colons or double quotes, e.g. " key: " The value", or "key": The value"

I use a series of PHP preg_replace calls to clean up the response, but the responses keep changing, so I cannot get reliable consistency. I am using the text-davinci-003 model.
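Roughly, the cleanup looks like this (a simplified sketch only; the real pattern list is longer, and $response here stands for the raw choices/text string):

$clean = $response;

// Replace smart/curly quotes with plain double quotes
$clean = preg_replace('/[\x{201C}\x{201D}]/u', '"', $clean);

// Re-join keys that come back split in two, e.g. "mc OptC" -> "mcOptC"
$clean = preg_replace('/"mc\s*Opt\s*([A-D])"/', '"mcOpt$1"', $clean);

// Fix keys returned in the wrong case, e.g. "mcoptc" -> "mcOptC"
$clean = preg_replace_callback('/"mcopt([a-d])"/i', function ($m) {
    return '"mcOpt' . strtoupper($m[1]) . '"';
}, $clean);

$data = json_decode($clean, true);
if (json_last_error() !== JSON_ERROR_NONE) {
    // Still broken after cleanup - this is where reliability falls apart
    error_log('JSON decode failed: ' . json_last_error_msg());
}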

Oddly/encouragingly, the first 2 or 3 { "retQues": ...} sets are perfectly fine, but after the 3rd things deteriorate. At one point I was getting odd non-text characters interspersed in the output (again after the 3rd set), but the above prompt fixed that.

I must be doing something wrong, but what is it?


I’m having the same issue and find it odd that there is still no solution for it. I’ve spent almost a whole day trying to convince GPT to reply with valid JSON, but no luck, which makes it very hard to build an app on it…

Have you found a solution?

Thanks for replying. Unfortunately, I have not found a solution as yet. I even tried requesting XML output. Adding more and more preg_replace statements to clean up the output is not 100% reliable.

I even started to get the non-text characters reappearing, which suggests the code that generates the JSON responses may be serialising the JSON string, but I’m guessing.

Maybe try putting the JSON example after the article rather than at the beginning of the prompt?
And use more realistic variable names instead of “mcOptC”; maybe that’s nudging it into using random letters.
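Something roughly like this, for example (an untested sketch of the reordering, keeping the original key names):

$prompt = "The article:\n" . $article . "\n\n"
    . "Using the article above, create 5 question objects.\n"
    . 'Return only a JSON array in exactly this format: '
    . '[{"retQues":"string","retAns":"string","mcOptA":"string",'
    . '"mcOptB":"string","mcOptC":"string","mcOptD":"string"}]';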

If all else fails, you could just add another call: “the following JSON is broken, please fix it”
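Something along these lines, sketched against the completions endpoint with text-davinci-003 as in the original post ($apiKey and $brokenJson are placeholders, error handling omitted):

$payload = json_encode([
    'model'       => 'text-davinci-003',
    'prompt'      => "The following JSON is broken. Return only the corrected, valid JSON:\n\n" . $brokenJson,
    'max_tokens'  => 1000,
    'temperature' => 0,
]);

$ch = curl_init('https://api.openai.com/v1/completions');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => [
        'Content-Type: application/json',
        'Authorization: Bearer ' . $apiKey,
    ],
    CURLOPT_POSTFIELDS     => $payload,
]);
$result = json_decode(curl_exec($ch), true);
curl_close($ch);

$fixedJson = trim($result['choices'][0]['text'] ?? '');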

“Which is making it very hard to build an app on it” - I thought the same. I don’t know how the apps already on the market are handling it. It seems like I need to code for 5-10 possible error cases in the JSON format to improve reliability. Even then, success is not guaranteed: the model will invent new ways of formatting the JSON and mess up the structure. The prompt alone cannot be the solution for making the model output consistent. I know fine-tuning is one option; I tried it, but it still fails 5-10% of the time, and the cost is so high compared with the ChatGPT 3.5 API.

Contrary to rumors on the net, if a developer manages their array of messages for the chat API completion properly, working with the system role is not a problem; it works fine at the “front” or the “back” of the messages param array.
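For instance, a minimal sketch of such a messages array (assuming PHP as in the original post and gpt-3.5-turbo, with the JSON-only instruction in the system message):

$messages = [
    ['role' => 'system',
     'content' => 'You reply with valid JSON only, exactly in the format the user specifies, with no extra text.'],
    ['role' => 'user',
     'content' => $prompt],
];

$payload = json_encode([
    'model'       => 'gpt-3.5-turbo',
    'messages'    => $messages,
    'temperature' => 0,
]);
// POST $payload to https://api.openai.com/v1/chat/completions (standard curl POST with an
// 'Authorization: Bearer ...' header); the reply text is in
// $result['choices'][0]['message']['content'].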

See, for example:

Just chiming in to let you know I’m getting the same behaviour. No matter how strictly I try to prompt it, it’s just inconsistent.

I created a prompt asking for a list in JSON format as the result; no matter how many attempts I make, it is inconsistent. Using models that are supposed to return JSON format does not work, as it does not return a list, just the first element.
Using regular models that return text, and asking in as much detail and with as many examples as I could, it returns JSON with syntax errors (a missing ‘,’ in the list of items, for example), which makes consistent processing impossible.

It really does not seem suitable for automated processing.