Getting response data as a fixed and consistent JSON response

Yes, I have the same issue in English. Not too often: maybe 2-5% of all my objects come back broken, depending on the structure of the information being read.

It can also add its own information, such as extra nested objects or arrays. Like you said, there are definitely uses for it though; I’m happy with the results, and it is still well worth the cleanup.

In your case, have you tried the same in English? I haven’t practiced too much in other languages, but I do notice slightly different answers in Spanish.


I’ll try requesting in English to see if it gives us better results. But we would then need to translate the JSON values into French anyway, which will lead to other issues in the translation process.

In my case, with the kind of requests I make, about 15% of the JSON arrays produced are totally broken. I request an array of objects containing a date, multiline text, place, subject, etc. It’s “just” a simple array of objects ({key: string}).
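To put a number on figures like the 2-5% or 15% mentioned above, one option is to run every completion through a strict parser and count the failures. This is only a minimal sketch: the sample strings are made up for illustration, and `is_valid_json` and `broken_rate` are hypothetical helper names.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def broken_rate(completions: list[str]) -> float:
    """Fraction of completions that fail to parse (0.0 for an empty list)."""
    if not completions:
        return 0.0
    broken = sum(1 for c in completions if not is_valid_json(c))
    return broken / len(completions)


# Hypothetical completions, written by hand for illustration.
samples = [
    '[{"date": "2023-02-26", "place": "Paris"}]',
    '[{date: "2023-02-26", "place": "Paris"}]',   # unquoted key
    '[{"date": "2023-02-26", "place": "Paris"}',  # missing closing bracket
    '[]',
]
print(f"{broken_rate(samples):.0%} broken")
```

Note that this only checks syntax; a completion can parse and still contain the wrong keys or value types.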

By requesting in English I mean something like:

Extract and categorize the information below using the following JSON structure:
<|ENDSEG|>
{
"data": [ {"nom": "", "âge": 0} , {"nom": "", "âge": 0} ]
}
<|ENDSEG|>
Info: [INFO]
<|ENDSEG|>
{
"data": [{ "nom": [END]

I don’t mean actually converting the results from French to English. I really don’t think it’ll help much, but it’s worth a shot. It may return some interesting results that I’d love to hear about myself.
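The prompt above ends with the opening of the expected JSON, so the model only has to continue it. A rough sketch of that idea in Python follows. It assumes `[END]` simply marks where the prompt stops, and that the completion contains only the text that follows the prefix; `build_prompt`, `parse_completion` and the sample completion are hypothetical.

```python
import json

SEPARATOR = "<|ENDSEG|>"  # separator copied from the prompt above


def build_prompt(info: str) -> tuple[str, str]:
    """Return (prompt, prefix); the prompt ends with the JSON prefix."""
    template = '{\n"data": [ {"nom": "", "âge": 0} , {"nom": "", "âge": 0} ]\n}'
    prefix = '{\n"data": [{ "nom":'
    prompt = (
        "Extract and categorize the information below "
        "using the following JSON structure:\n"
        f"{SEPARATOR}\n{template}\n{SEPARATOR}\n"
        f"Info: {info}\n{SEPARATOR}\n{prefix}"
    )
    return prompt, prefix


def parse_completion(prefix: str, completion: str):
    """Re-attach the prefix, since the model only returns what follows it."""
    return json.loads(prefix + completion)


prompt, prefix = build_prompt("Marie a 34 ans.")
# Hypothetical completion, written by hand for illustration.
completion = ' "Marie", "âge": 34}]\n}'
print(parse_completion(prefix, completion))
```

The prefix has to be glued back on before parsing, otherwise even a perfect completion looks like broken JSON.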

I just ran a simple test in English, and about 1 in 4 results was badly broken JSON.

For example, here is my prompt:

"Create a valid JSON array of objects to list famous french books :
[{
\"title\": \"Book title\", 
\"release_date\": \"Release date in France, formatted as YYYY-MM\", 
\"subject\": \"Book subject\", 
\"characters\": \"3 characters separated with linebreaks and tirets\"
}]
The JSON object is:"

Result:
[Screenshot from 2023-02-26 23-39-36]

The result is often broken because of the “release_date” format, bad quotes, etc.

Then I made a more complicated request:

"Create a valid JSON array of objects to list famous french books :
[{
\"title\": \"Book title\", 
\"release_date\": \"Release date in France, formatted as YYYY-MM\", 
\"subject\": \"Book subject\", 
\"location\": \"Place where lived the author\", 
\"characters\": \"3 characters and their characteristics, separated with linebreaks and tirets\"
}]

The JSON object is:"

The result becomes messy, with JSON property keys missing their quotes, etc.

I’ll let you know my results in English. I’m not sure we are going to test it, but I’ll keep you posted.


Thanks for trying. Are your escape characters used for the prompting? Where did they come from?

Also, have you tried without setting defaults such as | Title: Book Title | ? In my experience it’s better to only express the datatype and nothing else.

But yes, the issue that you’re running into (something being formatted incorrectly, like a datetime being formatted as an integer) is something I’ve noticed as well.

I think it’s fair to always accept a less than perfect result from a language model as it’s not based on simple logic.

So the question is : is it better to use it, and clean the result, or find another more reliable method? Which comes back to your point. Depends on the situation.
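One middle ground between "clean the result" and "find another method" is to parse strictly and simply ask again when parsing fails. A minimal sketch, assuming a hypothetical `generate` callable that returns one completion per call (the real API call is left out on purpose):

```python
import json
from typing import Any, Callable, Optional


def generate_valid_json(
    generate: Callable[[], str], max_attempts: int = 3
) -> Optional[Any]:
    """Call generate() until its output parses as JSON, or give up."""
    for _ in range(max_attempts):
        text = generate()
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue  # broken output: try again
    return None  # every attempt failed


# Stand-in for the model: first answer is truncated, second is valid.
fake_outputs = iter([
    '[{"title": "Les Misérables"',
    '[{"title": "Les Misérables"}]',
])
result = generate_valid_json(lambda: next(fake_outputs))
print(result)
```

With a failure rate of a few percent, a small number of attempts is usually enough, at the cost of extra requests. A caller that expects the literal JSON value `null` would need a different "failed" marker than `None`.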

Hi @stouch

I tested your prompt above and got a valid JSON completion:

I slightly changed your prompt @stouch and got a valid JSON completion for your more “complicated request”.

In my test prompt, I instructed the API to not escape double quotes in the output, like this: Do not escape the double quotes in the output:

Create a valid JSON array of objects to list famous french books :
[{
\"title\": \"Book title\", 
\"release_date\": \"Release date in France, formatted as YYYY-MM\", 
\"subject\": \"Book subject\", 
\"location\": \"Place where lived the author\", 
\"characters\": \"3 characters and their characteristics, separated with linebreaks and tirets\"
}]

Do not escape the double quotes in the output: The JSON object is:

Hope this helps 🙂


This was my full payload :

{
  "model": "text-davinci-003",
  "prompt": "Create a valid JSON array of objects to list famous french books :\n[{\"title\": \"Book title\", \"release_date\": \"Release date in France, formatted as YYYY-MM\", \"subject\": \"Book subject\", \"location\": \"Place where lived the author\", \"characters\": \"3 characters and their characteristics, separated with linebreaks and tirets\"}]\nThe JSON object:",
  "max_tokens": 600,
  "temperature": 0.2,
  "top_p": 1,
  "frequency_penalty": 1,
  "presence_penalty": 0
}
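On the earlier question about where the escape characters come from: in a payload like this, the backslashes belong to the JSON encoding of the request body, not to the prompt itself. A small sketch in Python, with the template shortened for brevity, shows the prompt written with plain quotes and the escapes appearing only after serialization:

```python
import json

# The prompt as the model should see it: plain double quotes, no backslashes.
prompt = (
    "Create a valid JSON array of objects to list famous french books :\n"
    '[{"title": "Book title", "subject": "Book subject"}]\n'
    "The JSON object:"
)

payload = {
    "model": "text-davinci-003",
    "prompt": prompt,
    "max_tokens": 600,
    "temperature": 0.2,
}

# json.dumps adds the \" and \n escapes needed to embed the prompt in JSON.
body = json.dumps(payload, indent=2)
print(body)
```

Building the body this way avoids hand-escaping, which is one place where stray backslashes can leak into the prompt text.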

FTR: I updated the regexes that fix our French output in 9 out of 10 of our cases: chatgpt-json-cleaner/index.php at main · stouch/chatgpt-json-cleaner · GitHub
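For readers who want the general idea without reading the PHP, here is a much smaller Python sketch of regex-based cleanup. It is not a port of the linked repository; the rules (strip text around the JSON, quote bare keys, drop trailing commas) are assumptions based on the failures described in this thread, and `extract_json_block` and `clean_json` are hypothetical names.

```python
import json
import re


def extract_json_block(text: str) -> str:
    """Keep only the span from the first '[' or '{' to the last ']' or '}'."""
    starts = [i for i in (text.find("["), text.find("{")) if i != -1]
    end = max(text.rfind("]"), text.rfind("}"))
    if not starts or end == -1:
        return text
    return text[min(starts): end + 1]


def clean_json(text: str) -> str:
    """Apply a few best-effort repairs to almost-valid JSON."""
    text = extract_json_block(text)
    # Quote bare keys: {title: ...} or , title: ...  ->  "title": ...
    text = re.sub(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)', r'\1"\2"\3', text)
    # Drop trailing commas before a closing bracket or brace.
    text = re.sub(r',\s*([\]}])', r'\1', text)
    return text


raw = (
    "Here is the list:\n"
    '[{title: "Germinal", "subject": "Mining strike",}]\n'
    "Hope it helps!"
)
print(json.loads(clean_json(raw)))
```

Regexes like these do not understand string boundaries, so a value containing something like `, note:` would be rewritten too. That is probably why this kind of cleanup fixes most cases rather than all of them.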

That is definitely not helping you!


Indeed, this parameter negatively impacts the JSON structure of the output… We’ll try to avoid using it. Thanks.

I’d be very interested in knowing why it actually impacts the JSON structure, PaulBellow?
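One plausible explanation, assuming the parameter in question is frequency_penalty (the only penalty set to a non-zero value in the payload above): as documented, a frequency penalty lowers a token's score in proportion to how often it has already appeared, and JSON has to repeat quotes, colons, commas and key names constantly. The toy sketch below uses a crude word/punctuation tokenizer rather than the model's real one, so the numbers are only illustrative.

```python
import re
from collections import Counter


def frequency_penalties(generated: str, frequency_penalty: float) -> dict[str, float]:
    """Toy model: amount subtracted from each token's score = count * penalty."""
    # Crude tokenizer for illustration: whole words or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", generated)
    counts = Counter(tokens)
    return {tok: n * frequency_penalty for tok, n in counts.items()}


so_far = (
    '[{"title": "Germinal", "subject": "Mining"}, '
    '{"title": "Candide", "subject": "Satire"}'
)
penalties = frequency_penalties(so_far, frequency_penalty=1.0)

# Show the most heavily penalized tokens after just two objects.
for tok, p in sorted(penalties.items(), key=lambda kv: -kv[1])[:3]:
    print(repr(tok), p)
```

In this toy view the double quote is by far the most penalized token, which would match the symptoms reported earlier in the thread (missing or wrong quotes around keys).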

I’m facing the same issue with ChatGPT’s API. It inserts unwanted strings in the middle of the JSON response, so sometimes I’m not able to process the JSON response!

Noob Question Alert!
So in the ChatGPT API, the prompt is in this format: "messages": [{"role": "user", "content": "Hello!"}]. Within the “content”, how can I give multiple valid JSON key-value pairs, as in your example above? This solution might work well for the regular completion API, but for the ChatGPT 3.5 model, how do I show the model a JSON example so that it consistently outputs valid JSON?

[
   {"role": "user", "content": "Hello!"},
   {"role": "user", "content": "Hello Again!"},
   {"role": "user", "content": "Hello Three Times!"},
   {"role": "user", "content": "Hello Forever!"}
]

HTH 🙂
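To make the "JSON example inside content" part concrete: the content field is just a string, so the template can be serialized into it. A minimal sketch in Python, where `build_messages` is a hypothetical helper and the template is a shortened version of the one used earlier in the thread:

```python
import json

# Shortened template, for illustration only.
template = [{
    "title": "Book title",
    "release_date": "Release date in France, formatted as YYYY-MM",
    "subject": "Book subject",
}]


def build_messages(instruction: str, template) -> list[dict]:
    """Embed the JSON template as plain text inside a single user message."""
    content = (
        f"{instruction}\n"
        f"{json.dumps(template, indent=2)}\n"
        "The JSON object is:"
    )
    return [{"role": "user", "content": content}]


messages = build_messages(
    "Create a valid JSON array of objects to list famous french books :",
    template,
)
# Serializing the request escapes the inner quotes automatically.
print(json.dumps({"messages": messages}, indent=2))
```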

I have tweaked my preferred prompt to give a better response, and it now specifically asks for RFC8259-compliant JSON. I have had results with ChatGPT as consistent as with Davinci, although for ChatGPT I also had to add the instruction not to provide an explanation, in order to consistently get only the JSON without any preamble:

system prompt:

Pretend you are an expert language translator

user prompt:

Create a list of three random source phrases and three random translations for each.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
[{
  "source_language": "language of original phrase",
  "source_phrase": "the phrase to be translated",
  "translations": [{
    "trans_language": "language of the translation",
    "translation": "the translated phrase"
  }]
}]
The JSON response:

This gives a consistent JSON response along the lines of…

[{
  "source_language": "English",
  "source_phrase": "I love pizza",
  "translations": [{
    "trans_language": "Spanish",
    "translation": "Me encanta la pizza"
  },{
    "trans_language": "Italian",
    "translation": "Amo la pizza"
  },{
    "trans_language": "French",
    "translation": "J'adore la pizza"
  }]
},{
  "source_language": "Mandarin",
  "source_phrase": "你好吗?",
  "translations": [{
    "trans_language": "English",
    "translation": "How are you?"
  },{
    "trans_language": "Korean",
    "translation": "어떻게 지내?"
  },{
    "trans_language": "Japanese",
    "translation": "お元気ですか?"
  }]
},{
  "source_language": "Russian",
  "source_phrase": "Я люблю мороженое",
  "translations": [{
    "trans_language": "German",
    "translation": "Ich liebe Eis"
  },{
    "trans_language": "Turkish",
    "translation": "Dondurma seviyorum"
  },{
    "trans_language": "Polish",
    "translation": "Kocham lody"
  }]
}]
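
Even with a prompt like this, it can be worth checking that a reply has the requested shape before using it. The sketch below is one way to do that with the standard library only; `is_translation_list` is a hypothetical helper, and the key names are taken from the format in the prompt above.

```python
import json

ITEM_KEYS = {"source_language", "source_phrase", "translations"}
TRANSLATION_KEYS = {"trans_language", "translation"}


def is_translation_list(text: str) -> bool:
    """Check that text is valid JSON and matches the requested structure."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, list):
        return False
    for item in data:
        if not isinstance(item, dict) or set(item) != ITEM_KEYS:
            return False
        translations = item["translations"]
        if not isinstance(translations, list):
            return False
        for t in translations:
            if not isinstance(t, dict) or set(t) != TRANSLATION_KEYS:
                return False
    return True


good = (
    '[{"source_language": "English", "source_phrase": "I love pizza", '
    '"translations": [{"trans_language": "Spanish", '
    '"translation": "Me encanta la pizza"}]}]'
)
with_preamble = "Sure! Here is the JSON: " + good
print(is_translation_list(good), is_translation_list(with_preamble))
```

The second call fails because of the preamble, which is the case the "do not include any explanations" instruction is meant to prevent.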

This worked great for me! I’m consistently receiving JSON responses.


Finally, an approach that seems to work, thank you!
Roy

I would like to hear whether anyone has experience with placing instructions in the system role content with chatCompletions. My experience has not been too good so far.

I just figured out another prompt. My inputs are different, though, and need to be categorized as “high”, “medium”, “small”, or “extra_small”.

My prompt (relevant part in bold): “Without any comment, return the result in the following JSON format {"high":[…],"medium":[…],"small":[…],"extra_small":[…]}”

My app depends on a pre-formatted JSON structure,
which contains not only the “reply in text”
but also various system commands and args.

My tip for anyone who needs this:

THE POSITION of the JSON-only instruction is the MAIN FACTOR in how consistently GPT will follow it.

– As long as this particular instruction is the VERY LAST part of the entire prompt, you are good to go.
– I place it just under the user input (as a reminder).

This works best for me.
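A small sketch of that ordering, assuming the prompt is assembled from a context block, the user input and the JSON-only instruction. The instruction wording and the `build_prompt` helper are hypothetical; the only point is that the instruction is always appended last.

```python
# Hypothetical wording; the position is what matters here.
JSON_ONLY = "Reply with JSON only, no extra text."


def build_prompt(context: str, user_input: str, json_instruction: str = JSON_ONLY) -> str:
    """Assemble the prompt so the JSON-only instruction is the very last part."""
    parts = [
        context.strip(),
        f"User input: {user_input.strip()}",
        json_instruction.strip(),  # placed after the user input, as a reminder
    ]
    return "\n\n".join(parts)


prompt = build_prompt(
    "You control an app through commands and args.",
    "Turn on the lights in the kitchen.",
)
print(prompt)
```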

I thought this approach from Microsoft was really interesting: GitHub - microsoft/guidance: A guidance language for controlling large language models.


Hey, we were having this problem as well. The way we solved it was by adding “reply in JSON format” to every interaction we had with ChatGPT, not only in the prompt. It seems to be working.
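In a multi-turn conversation, that could look roughly like the sketch below, where every new user turn gets the reminder appended before it is added to the history. `with_reminder` is a hypothetical helper and the assistant reply is written by hand; no API call is made.

```python
REMINDER = "Reply in JSON format."


def with_reminder(history: list[dict], user_text: str, reminder: str = REMINDER) -> list[dict]:
    """Return a new history with the reminder appended to the new user turn."""
    message = {"role": "user", "content": f"{user_text}\n{reminder}"}
    return history + [message]


history: list[dict] = []
history = with_reminder(history, "List three famous French books.")
# Hand-written stand-in for the model's reply.
history.append({"role": "assistant", "content": '[{"title": "Germinal"}]'})
history = with_reminder(history, "Add the release date to each one.")

for message in history:
    print(message["role"], "->", message["content"])
```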