Ensure JSON response format

Hello,

I often want the API to send back a response that I can parse programmatically. For example, I want it to summarize a passage of text into comments, where each comment is a member of a JSON array.

It’s frustrating. Sometimes the response will have “JSON:” prepended to it. Sometimes it won’t. Sometimes there will be some other sort of preamble explaining that it’s JSON.

A pretty common feature of APIs is the ability to specify a response format. It would make things much more usable for developers here.


Have you tried asking in the prompt for certain parts of the response to be returned as JSON? For a similar problem with math expressions, asking for LaTeX is working fairly well. (ref)

It also seems that the more ChatGPT is asked to use LaTeX for math expressions, the more it does so automatically and gets it correct, but that could be my wishful thinking. :slightly_smiling_face:


Yes, I tell it to format the response as JSON.

:slightly_frowning_face:

Have you tried adding some few-shot examples?

Some things you might try:

> Provide your answer in JSON form. Reply with only the answer in JSON form and include no other commentary:

Another thing you might try is adding `` \n```json `` to the very end of your prompt, to indicate the beginning of a fenced JSON block in markdown.
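For instance, here is a minimal sketch of that prompt shape, assuming the openai Python package (the model name and passage are placeholders):

````python
import openai

passage = "..."  # the text you want summarized

# Instruct the model to reply with only JSON, then open a fenced JSON
# block at the very end of the prompt to nudge it into completing one.
prompt = (
    "Summarize the passage below as a JSON array of comments. "
    "Reply with only the answer in JSON form and include no other commentary.\n\n"
    f"{passage}\n"
    "```json\n"
)

response = openai.Completion.create(
    model="text-davinci-003",  # placeholder: use whichever model you're on
    prompt=prompt,
    max_tokens=500,
    stop="```",  # stop when the model closes the fenced block
)
print(response["choices"][0]["text"])
````

The trailing fence primes the model to complete a JSON block, and the `stop` sequence cuts the completion off at the closing fence.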


Telling the model what to do (prompting) alone gives poor reliability, maybe a 60% success rate. Give a few examples of your output style and you can improve that to around 90%. If you need 100% reliability, i.e. JSON output you can parse error-free, use two API calls: the first one gets the text, and the second call formats that text as JSON. If you get a parse error, loop the generated text back into the second API call; the loop should only end once the JSON parses without error. This is one way I have used to ensure reliability when passing the text to the front end and to avoid errors.
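Roughly, that second call with the parse-check loop looks like this (a minimal sketch assuming the openai Python package; the model name, prompt wording, and retry budget are placeholders):

```python
import json
import openai

def to_json(text, max_retries=3):
    """Second API call: reformat `text` as JSON, retrying until it parses."""
    for _ in range(max_retries):
        response = openai.Completion.create(
            model="text-davinci-003",  # placeholder model name
            prompt=f"Reformat the following as a JSON array of comments. "
                   f"Reply with only valid JSON:\n\n{text}",
            max_tokens=500,
        )
        candidate = response["choices"][0]["text"].strip()
        try:
            return json.loads(candidate)  # loop ends on the first clean parse
        except json.JSONDecodeError:
            continue  # parse error: loop the text back into another call
    raise ValueError("no valid JSON within the retry budget")
```

Here `text` would be the output of your first, content-generating API call.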

As mentioned, a few-shot example works wonders. But depending on how large and nested your JSON object is, you will continuously run into issues.

You would have a better time if you were to train a model, or simplify it with token classification and use that to logically create your object.

@RonaldGRuckus, that is interesting. I could train my own custom model to understand JSON output, and then use that model going forward. Is that possible with the chat endpoint? It seems like it’s not, or not yet. (ref)

What do you mean by this:

> simplify it with token classification

GPT is really good at creative output. By design, even with a temperature of 0, it can produce different outputs from the same prompt. Because of that it can be a bad idea to rely on it for NLP → JSON, but in my experience it is still very good at it up to a certain point.

My issue is that this certain point isn’t good enough, and I can’t control or tune it.

Token classification would work better. Instead of relying on a zero-shot, or even few-shot, prompt to build your JSON object, you would simply identify the items by category, and then use those categories to build the object logically.

Here’s an example of a token classifier for the medical industry. (ref)

You could even start with zero-shot classification on a few checkpoints to see which would work best for your game plan. (ref)
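A minimal sketch of that idea, assuming the Hugging Face transformers zero-shot pipeline (the comment and category labels are made up for illustration):

```python
import json
from transformers import pipeline

# Assumption: the default zero-shot checkpoint; swap in any NLI model.
classifier = pipeline("zero-shot-classification")

comment = "The author overstates the results in section 3."
labels = ["criticism", "praise", "question"]  # hypothetical categories

result = classifier(comment, candidate_labels=labels)

# Build the JSON object in code instead of asking the model to emit it.
record = {"text": comment, "category": result["labels"][0]}
print(json.dumps(record))
```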

The API response is actually already in JSON format; it just so happens that the completion itself is delivered as a text value inside that JSON.

I think it is more reliable to parse this text value and convert it to JSON after the completion, versus trying to prompt a stochastic language model to always output reliable JSON data.
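For example, a rough sketch of that post-hoc parsing, assuming the JSON you want is the first `{...}` or `[...]` span in the completion text:

```python
import json
import re

def extract_json(completion_text):
    """Strip any preamble like 'JSON:' and parse the JSON span that follows."""
    match = re.search(r"[\[{].*[\]}]", completion_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON found in completion")
    return json.loads(match.group(0))

extract_json('JSON: ["first comment", "second comment"]')
# -> ['first comment', 'second comment']
```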

FWIW,

:slight_smile:

I think you may be right. I’ve had more consistent luck asking it to respond with a numbered list, each list item on its own line. That’s easy enough to parse into JSON with a little string hackery.
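That string hackery might look something like this (a sketch that assumes items like `1. ...`, one per line; the reply text is made up):

```python
import json
import re

reply = """1. The opening is strong.
2. The middle section drags.
3. The conclusion feels abrupt."""

# Strip the leading "N." or "N)" from each non-empty line, then emit JSON.
items = [re.sub(r"^\s*\d+[.)]\s*", "", line)
         for line in reply.splitlines() if line.strip()]
print(json.dumps(items, indent=2))
```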
