There are two things to specify for your function parameters, which can be added as a system prompt or to the description of a function or parameter, to correct an error that is regularly produced:
“JSON framework must be constructed with double-quotes. Double quotes within strings must be escaped with backslash, single quotes within strings will not be escaped.”
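That escaping rule is easy to sanity-check on the backend with Python's standard `json` module. A minimal sketch (the variable names are just for illustration):

```python
import json

# Valid JSON: double-quoted keys and values, inner double quotes escaped
good = '{"reply": "She said \\"hi\\" and left"}'
print(json.loads(good)["reply"])  # the escaped inner quotes survive parsing

# Single-quoted strings are not valid JSON and will not parse
bad = "{'reply': 'hello'}"
try:
    json.loads(bad)
except json.JSONDecodeError:
    print("single quotes break json.loads")
```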
I am not using functions. It's a bit of alchemy. I have played around with it a bit more.
In my prompt, I have something like this
"""
An example of the sample output looks like this: {response_format}
"""
The response_format is loaded from a JSON file in which all the names and values are in double quotes.
If I remove it, I get double quotes but a structure that is not to my liking. If I keep it, I get single quotes in the JSON.
The model needs to see an example of the JSON you want back, and that JSON should use double quotes. I would recommend only asking the model to return strings, as it gets a little dicey if you ask it to return a mix of types like numbers and bools.
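One way to enforce the strings-only advice on the backend is to reject any reply whose values aren't all strings. A hypothetical helper, assuming the reply is a flat JSON object:

```python
import json

def parse_flat_strings(reply: str) -> dict:
    """Parse a model reply, insisting on a flat object of string values."""
    data = json.loads(reply)
    bad = [k for k, v in data.items() if not isinstance(v, str)]
    if bad:
        raise ValueError(f"non-string values for keys: {bad}")
    return data
```

Anything the model sneaks in as a number or bool then fails fast instead of corrupting downstream code that expects strings.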
Prompt used above, so you don’t have to retype to fix all your JSON woes:
# Instruction
You are a helpful but brief answering assistant for eight-year-old children, generating up to 40 word output only in JSON for our web site's python backend.
# Output format
JSON, keys and values require double-quotes
JSON keys: ["response", "topic grade level", "too_hard"]
JSON value type: [escaped string, int, truth value]
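A backend check for that exact three-key format might look like this. Just a sketch; `json.loads` already handles the escaped strings, and JSON `true`/`false` arrives as Python `bool`:

```python
import json

# Keys and types as described in the prompt above
EXPECTED = {"response": str, "topic grade level": int, "too_hard": bool}

def validate_reply(reply: str) -> dict:
    """Parse the model reply and verify every expected key has the right type."""
    data = json.loads(reply)
    for key, typ in EXPECTED.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"missing or mistyped key: {key!r}")
    return data
```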
For a more complex JSON structure with an inner array, I had to inject an “[AI: JSON output format]” post-prompt to get it not to completely ignore the JSON instructions and just produce regular text. That solved, the system message works as anticipated:
# AI Rules
- Output only described JSON output format compatible with python backend.
- JSON, keys and values require double-quotes
# Role
You are a helpful and brief answering assistant for eight-year-old children. Generate up to 40 word response.
# JSON output format
JSON keys: ["user input", "response", "grade_report"]
JSON value type: [escaped string, escaped string, JSON subarray 1]
JSON subarray 1: "grade_report"
JSON subarray 1 keys: ["user input estimated grade level", "AI output estimated grade level", "grade level required to answer", "question exceeds target level"]
JSON subarray 1 value type: [int, int, int, truth bool]
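The nested version can be checked the same way, one level at a time. Another sketch against the keys above (note the "subarray" is really a nested JSON object):

```python
import json

INT_KEYS = [
    "user input estimated grade level",
    "AI output estimated grade level",
    "grade level required to answer",
]

def validate_graded_reply(reply: str) -> dict:
    """Check the nested grade_report object described in the prompt."""
    data = json.loads(reply)
    report = data["grade_report"]  # the "JSON subarray" is a nested object
    for k in INT_KEYS:
        # exclude bool: in Python, isinstance(True, int) is True
        if not isinstance(report[k], int) or isinstance(report[k], bool):
            raise ValueError(f"expected int for {k!r}")
    if not isinstance(report["question exceeds target level"], bool):
        raise ValueError("expected bool for the exceeds-level flag")
    return data
```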
Sorry, not trying to be argumentative… the thing to understand is that these models are fundamentally pattern recognizers. They will always favor following patterns over instructions. To that end, showing them a pattern of the JSON to return will always outperform a schema definition of the JSON to return.
Showing them schema definitions CAN work but it’s not perfect. Honestly, nothing is perfect. The models are probabilistic so there’s always going to be a chance of error. I feel like I’ve gotten fairly close to making them as reliable as you can by confronting them when they make a mistake and simply asking them to correct their error. This works surprisingly well.
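That correct-your-error loop is easy to automate. A sketch, where `ask_model` stands in for whatever chat-completion call you use (it is not a real API, just any callable from prompt text to reply text):

```python
import json

def json_with_retries(ask_model, prompt: str, retries: int = 2) -> dict:
    """Ask for JSON; on a parse failure, confront the model with its mistake."""
    reply = ask_model(prompt)
    for _ in range(retries):
        try:
            return json.loads(reply)
        except json.JSONDecodeError as err:
            reply = ask_model(
                "Your last reply was not valid JSON "
                f"({err.msg} at position {err.pos}). Here it was:\n{reply}\n"
                "Please reply again with only the corrected JSON."
            )
    return json.loads(reply)  # still broken after all retries: let it raise
```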
True, and any and all techniques combined improve the quality of output that is now always on a knife’s edge of breaking.
However, the massive instruction-following training degrades the completion-example style multi-shot ability, and then they reduced instruction-following further (so the model disobeys jailbreakers who lead it on by example).
For my JSON above, we can put in a few example user/assistant turns to improve its understanding of the actual JSON values expected, so it doesn't just get the format right; it also knows how to answer the questions inside.
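In a chat-completions style message list, that looks something like the sketch below. The role/content dict shape follows the common chat-API convention, and the example answer is abbreviated:

```python
import json

SYSTEM_PROMPT = "You are a helpful and brief answering assistant for eight-year-old children."

def build_messages(user_question: str) -> list:
    """Prepend one worked user/assistant pair so the model sees real values."""
    example = {
        "user input": "What is a math modulus function?",
        "response": "It gives the remainder when one number is divided by another.",
        "grade_report": {
            "user input estimated grade level": 4,
            "AI output estimated grade level": 6,
            "grade level required to answer": 5,
            "question exceeds target level": False,
        },
    }
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What is a math modulus function?"},
        {"role": "assistant", "content": json.dumps(example)},
        {"role": "user", "content": user_question},
    ]
```

The example assistant turn is serialized with `json.dumps`, so the model sees exactly the double-quoted format it is expected to reproduce.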
One must now be careful not to bias GPT with the examples given, because it sees the system prompt through a very blurry window. A 20-questions game with examples in the system prompt? It's going to ask only those exact questions until you make them preposterous, like "does it live on the moon?"
Counterpoint: with Davinci-003 (from the days of actual prompt engineering, where pure completion is purely examples leading the model on), I can show the quality to have been destroyed in six months. The AI gets it right from one example (and unlike turbo, its text length is also correct):
Our website’s API AI answers like this:
Human: What is a math modulus function?
AI:
{
  "user input": "What is a math modulus function?",
  "response": "A modulus function in math gives the remainder when one number is divided by another. For example, 7 divided by 3 is 2 with a remainder of 1, so the modulus of 7 and 3 is 1.",
  "grade_report": {
    "user input estimated grade level": 4,
    "AI output estimated grade level": 6,
    "grade level required to answer": 5,
    "question exceeds target level": false
  }
}
Human: What is the relationship between e and log?
AI:
{
  "user input": "What is the relationship between e and log?",
  "response": "The natural logarithm of a number is the power to which the base e (2.71828…) must be raised to equal that number. In other words, the natural logarithm of a number is the inverse of the exponential function of that number.",
  "grade_report": {
    "user input estimated grade level": 4,
    "AI output estimated grade level": 8,
    "grade level required to answer": 7,
    "question exceeds target level": false
  }
}