Often you need the output of an LLM to be valid JSON. Even when you include phrases such as "the output must be in JSON format" or variations thereof, the LLM sometimes prefixes its answer with something like "here is the output in json" or a "```json" fence.
That makes downstream parsing of the LLM's output fail (i.e., it is no longer valid JSON).
A simple trick to increase the likelihood of the LLM producing pure JSON output is the "logit_bias" parameter of the OpenAI API. This parameter is a dictionary mapping token IDs to a bias value between -100 and 100: -100 effectively bans a token, 100 effectively forces it, and smaller values make the corresponding characters less or more likely to appear in the output.
For example, the following increases the probability of "{" and "}" and decreases the probability of ``` or ''' in the output.
logit_bias: {
"90": 10, // token ID for "{"
"92": 10, // token ID for "}"
"19317": -10, // token ID for "'''"
"19317": -10, // token ID for "'''"
"74694": -10 // token ID for "```"
}
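As a rough sketch of how this might be wired up (not the author's exact code): the token IDs above are specific to one tokenizer, so it can be safer to compute them with tiktoken rather than hardcode them, then pass the resulting dictionary to a chat completion. The model name and prompts below are placeholders; this assumes the official openai and tiktoken Python packages.

# Sketch: bias a chat completion toward "{"/"}" and away from code fences.
# Token IDs are computed from the cl100k_base tokenizer instead of hardcoded.
import tiktoken
from openai import OpenAI

enc = tiktoken.get_encoding("cl100k_base")

def bias_for(strings_to_bias: dict[str, int]) -> dict[str, int]:
    """Map every token that makes up each string to the requested bias value."""
    bias: dict[str, int] = {}
    for text, value in strings_to_bias.items():
        for token_id in enc.encode(text):
            bias[str(token_id)] = value
    return bias

logit_bias = bias_for({
    "{": 10,     # encourage the opening brace
    "}": 10,     # encourage the closing brace
    "'''": -10,  # discourage triple-quote fences
    "```": -10,  # discourage markdown code fences
})

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Return only a JSON object, with no prose around it."},
        {"role": "user", "content": "Summarize today's tasks as JSON."},
    ],
    logit_bias=logit_bias,
)
print(response.choices[0].message.content)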
MANDATORY: Every single response produced after the "assistant" prompt must begin with the exact characters shown within the single-quotes: '{"function_name": "'
MANDATORY: Every single response then continues with valid JSON output complying with the included JSON schema, and will be validated, allowing no deviation.
Note: There is no user to communicate with directly. AI JSON output response is provided directly to an external API interface backend.
// JSON Schema for assistant response
(real json schema as you’d validate AI output with…)
Using that as the very last line the chat model sees tends to yield pretty good results. After that, it's mostly smooth sailing (apart from the apostrophe issue, which can be overcome by using the right schema hint).
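To make the flow concrete, here is a hedged sketch of how that tail of the system prompt might be assembled and the reply validated. The schema, model name, and user message are stand-ins, not the author's actual ones; only the MANDATORY lines come from the text above.

# Sketch: put the MANDATORY instructions and a schema at the very end of the
# system message, then parse and validate the reply. Illustrative values only.
import json
from jsonschema import validate  # pip install jsonschema
from openai import OpenAI

schema = {
    "type": "object",
    "properties": {
        "function_name": {"type": "string"},
        "arguments": {"type": "object"},
    },
    "required": ["function_name", "arguments"],
}

system_prompt = (
    "You are a function-calling backend.\n"
    'MANDATORY: Every single response produced after the "assistant" prompt must begin '
    "with the exact characters shown within the single-quotes: '{\"function_name\": \"'\n"
    "MANDATORY: Every single response then continues with valid JSON output complying "
    "with the included JSON schema, and will be validated, allowing no deviation.\n"
    "Note: There is no user to communicate with directly. AI JSON output response is "
    "provided directly to an external API interface backend.\n"
    "// JSON Schema for assistant response\n"
    + json.dumps(schema)
)

client = OpenAI()
reply = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Look up the weather for Oslo."},
    ],
).choices[0].message.content

data = json.loads(reply)   # raises if the model wrapped the JSON in prose
validate(data, schema)     # raises if the JSON does not match the schema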
Now, of course, if our dear AI providers allowed us to take off the training wheels with these chat models, we’d have a lot more latitude.
Yes, other competitive AI providers (ahem) allow assistant prompt completion, where you can place that JSON opening and a key directly after the internal "assistant" role that is automatically added. It also works a treat on OpenAI completions with -instruct models: write the start of the JSON yourself in the prompt, set an additional stop sequence at the end of the JSON, and pay the output rate only for the varying internal content. Someone at OpenAI just needs to give ChatML that, and to stop treating the developer as an untrustworthy child the way the assistants endpoint does.
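A hedged sketch of that completions-endpoint trick, with an assumed prompt, stop sequence, and single-key JSON shape (none of this is an officially documented pattern): write the opening of the JSON yourself, let the model fill in only the value, and cut it off at the closing quote and brace.

# Sketch: "prefill" the start of the JSON on the legacy completions endpoint
# (gpt-3.5-turbo-instruct) and stop generation early, so you pay output tokens
# only for the variable middle. Illustrative values throughout.
from openai import OpenAI

client = OpenAI()

prefix = '{"function_name": "'  # written by us in the prompt, not billed as output
completion = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt=(
        "Decide which function to call for the request below and answer as JSON.\n"
        "Request: get tomorrow's forecast for Oslo\n\n"
        + prefix
    ),
    max_tokens=50,
    stop=['"}'],                # stop once the model closes the string and object
)

full_json = prefix + completion.choices[0].text + '"}'
print(full_json)                # e.g. {"function_name": "get_weather"}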