Ensure JSON response format

Hello,

I often want the API to send back a response that I can parse programmatically. For example, I want it to summarize a passage of text into comments, where each comment is a member of a JSON array.

It’s frustrating. Sometimes the response will have “JSON:” prepended to it. Sometimes it won’t. Sometimes there will be some other sort of preamble explaining that it’s JSON.

A pretty common feature of APIs is the ability to specify a response format. It would make things much more usable for developers here.


Have you tried asking in the prompt for certain parts of the response to be returned as JSON? For a similar problem with math expressions, asking for LaTeX is working fairly well. (ref)

It also seems that the more ChatGPT is asked to use LaTeX for math expressions, the more it does so automatically and gets it correct, but that could be my wishful thinking. :slightly_smiling_face:


Yes, I tell it to format the response as JSON.

:slightly_frowning_face:

Have you tried adding some few-shot examples?

Some things you might try:

> Provide your answer in JSON form. Reply with only the answer in JSON form and include no other commentary:

Another thing you might try is adding `` \n```json `` to the very end of your prompt, to indicate the beginning of a fenced JSON block in markdown.
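For instance, here is a minimal sketch of that prompt shape, assuming the openai Python package (the model name and passage are placeholders):

````python
import openai

passage = "..."  # the text you want summarized

# Instruct the model to reply with only JSON, then open a fenced JSON
# block at the very end of the prompt to nudge it into completing one.
prompt = (
    "Summarize the passage below as a JSON array of comments. "
    "Reply with only the answer in JSON form and include no other commentary.\n\n"
    f"{passage}\n"
    "```json\n"
)

response = openai.Completion.create(
    model="text-davinci-003",  # placeholder: use whichever model you're on
    prompt=prompt,
    max_tokens=500,
    stop="```",  # stop when the model closes the fenced block
)
print(response["choices"][0]["text"])
````

The trailing fence primes the model to complete a JSON block, and the `stop` sequence cuts the completion off at the closing fence.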


Telling the model what to do (prompting) alone gives poor reliability, maybe a 60% success rate. Give a few examples of your output style and you can improve that to around 90%. If you need 100% reliability, i.e. JSON output you can parse error-free, use two API calls: the first one gets the text, and the second call formats that text as JSON. If you get a parse error, loop the generated text back into the second API call; the loop should only end once the JSON parses without error. This is one way I have used to ensure reliability when passing the text to the front end and to avoid errors.
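Roughly, that second call with the parse-check loop looks like this (a minimal sketch assuming the openai Python package; the model name, prompt wording, and retry budget are placeholders):

```python
import json
import openai

def to_json(text, max_retries=3):
    """Second API call: reformat `text` as JSON, retrying until it parses."""
    for _ in range(max_retries):
        response = openai.Completion.create(
            model="text-davinci-003",  # placeholder model name
            prompt=f"Reformat the following as a JSON array of comments. "
                   f"Reply with only valid JSON:\n\n{text}",
            max_tokens=500,
        )
        candidate = response["choices"][0]["text"].strip()
        try:
            return json.loads(candidate)  # loop ends on the first clean parse
        except json.JSONDecodeError:
            continue  # parse error: loop the text back into another call
    raise ValueError("no valid JSON within the retry budget")
```

Here `text` would be the output of your first, content-generating API call.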

As mentioned, a few-shot example works wonders. But depending on how large and nested your JSON object is, you will continuously run into issues.

You would have a better time if you were to train a model, or simplify it with token classification and use that to logically create your object.

@RonaldGRuckus, that is interesting. I could train my own custom model to understand JSON output, and then use that model going forward. Is that possible with the chat endpoint? It seems like it’s not, or not yet. (ref)

What do you mean by this:

> simplify it with token classification

GPT is really good at creative output. By design, even with a temperature of 0, it can produce different outputs from the same prompt. Because of that it can be a bad idea to rely on it for NLP → JSON, but in my experience it is still very good at it up to a certain point.

My issue is that this certain point isn’t good enough, and I can’t control or tune it.

Token classification would work better. Instead of relying on a zero-shot, or even few-shot, prompt to build your JSON object, you would simply identify the items by category, and then use those categories to build the object logically.

Here’s an example of a token classifier for the medical industry. (ref)

You could even start with zero-shot classification on a few checkpoints to see which would work best for your game plan. (ref)
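A minimal sketch of that idea, assuming the Hugging Face transformers zero-shot pipeline (the comment and category labels are made up for illustration):

```python
import json
from transformers import pipeline

# Assumption: the default zero-shot checkpoint; swap in any NLI model.
classifier = pipeline("zero-shot-classification")

comment = "The author overstates the results in section 3."
labels = ["criticism", "praise", "question"]  # hypothetical categories

result = classifier(comment, candidate_labels=labels)

# Build the JSON object in code instead of asking the model to emit it.
record = {"text": comment, "category": result["labels"][0]}
print(json.dumps(record))
```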

The API response is actually already in JSON format; it just so happens that the completion itself is delivered as a text value inside that JSON.

I think it is more reliable to parse this text value and convert it to JSON after the completion, versus trying to prompt a stochastic language model to always output reliable JSON data.
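For example, a rough sketch of that post-hoc parsing, assuming the JSON you want is the first `{...}` or `[...]` span in the completion text:

```python
import json
import re

def extract_json(completion_text):
    """Strip any preamble like 'JSON:' and parse the JSON span that follows."""
    match = re.search(r"[\[{].*[\]}]", completion_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON found in completion")
    return json.loads(match.group(0))

extract_json('JSON: ["first comment", "second comment"]')
# -> ['first comment', 'second comment']
```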

FWIW,

:slight_smile:

I think you may be right. I’ve had more consistent luck asking it to respond with a numbered list, each list item on its own line. That’s easy enough to parse into JSON with a little string hackery.
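That string hackery might look something like this (a sketch that assumes items like `1. ...`, one per line; the reply text is made up):

```python
import json
import re

reply = """1. The opening is strong.
2. The middle section drags.
3. The conclusion feels abrupt."""

# Strip the leading "N." or "N)" from each non-empty line, then emit JSON.
items = [re.sub(r"^\s*\d+[.)]\s*", "", line)
         for line in reply.splitlines() if line.strip()]
print(json.dumps(items, indent=2))
```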
