I thought I’d show something interesting: strict: false is barely better than just instructing the AI about JSON yourself, which you can do more verbosely:
- a non-strict schema for response_format is placed for the AI’s understanding the same way, as a mere injection of what was provided;
- the way schemas are placed suggests that OpenAI has private or future use of multiple response output types for their own model usage.
So: I replicate, trick, and expand on the system message placement of a schema, then see how it performs, relying on the AI’s understanding and on any proprietary training done on following schemas.
import json
from openai import Client

# The client reads OPENAI_API_KEY from the environment
client = Client()
system_message = """You are a helpful assistant.
Prefer indexed JSON object output responses.
# Responses
## multi_item_response
{
"type": "object",
"title": "Multi-item Responses",
"description": "JSON responses with multiple indexed items for any lists as output",
"patternProperties": {
"^[0-9]+$": {
"type": "string"
}
},
"additionalProperties": false
}
## single_item_response
{
"type": "object",
"title": "Single-item text responses",
"description": "Produce a single-item or direct response to user",
"properties": {
"item": {
"type": "string",
"description": "The content of the response item."
}
},
"required": [
"item"
],
"additionalProperties": false
}"""
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": "List four good xmas presents for mom."},
    ],
    # JSON mode: the model emits JSON, but no schema is enforced
    response_format={"type": "json_object"},
)
content = response.choices[0].message.content
try:
    # Round-trip through the json library to show the output is valid JSON
    print(json.dumps(json.loads(content), indent=2))
except (json.JSONDecodeError, TypeError):
    print(f"The response failed JSON-parsing. Response:\n{content}")
In my system message, you can see how the schemas are just plonked there with no guidance. That’s the way the API injects them too, except that there they are also minified.
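The placement described here can also be built programmatically. This is a minimal sketch, assuming the layout from my system message (a "# Responses" heading, then one "##" heading per schema name) and assuming that compact json.dumps separators are a fair stand-in for "minified". The helper names render_schema_section and build_system_message are hypothetical, and the exact text OpenAI injects internally is not documented.

```python
import json


def render_schema_section(name: str, schema: dict) -> str:
    """Return a '## name' heading followed by the schema as minified JSON."""
    minified = json.dumps(schema, separators=(",", ":"))
    return f"## {name}\n{minified}"


def build_system_message(preamble: str, schemas: dict) -> str:
    """Join the preamble, a '# Responses' heading, and one section per schema."""
    sections = [render_schema_section(name, schema) for name, schema in schemas.items()]
    return "\n".join([preamble, "# Responses", *sections])


# The two schemas from the system message, as Python dicts
schemas = {
    "multi_item_response": {
        "type": "object",
        "title": "Multi-item Responses",
        "description": "JSON responses with multiple indexed items for any lists as output",
        "patternProperties": {"^[0-9]+$": {"type": "string"}},
        "additionalProperties": False,
    },
    "single_item_response": {
        "type": "object",
        "title": "Single-item text responses",
        "description": "Produce a single-item or direct response to user",
        "properties": {
            "item": {
                "type": "string",
                "description": "The content of the response item.",
            }
        },
        "required": ["item"],
        "additionalProperties": False,
    },
}

system_message = build_system_message(
    "You are a helpful assistant.\nPrefer indexed JSON object output responses.",
    schemas,
)
print(system_message)
```

Building the message from dicts also guarantees that each schema is valid JSON before it reaches the model.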
Response:
{
"1": "A personalized piece of jewelry, like a necklace with her initials or birthstones.",
"2": "A spa day gift certificate for some relaxation and pampering.",
"3": "A high-quality scented candle or a set of aromatherapy oils.",
"4": "A custom photo album or framed family photo to cherish memories."
}
What have I done? I’ve not only placed optional schemas right into the system message instead of using convoluted, nested anyOf constructions, but I also gave the AI an open-ended schema with the unsupported keyword patternProperties and no predefined keys.
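For contrast, here is roughly what the anyOf route could look like. This is a sketch under my reading of the strict-mode restrictions: the root must be an object rather than an anyOf, every property must be listed in required, and every object needs additionalProperties set to false. Since patternProperties is not available there, the indexed keys are swapped for an array. The names combined_schema, response, and items are hypothetical.

```python
import json

# Hypothetical strict-compatible equivalent of the two optional schemas.
# The anyOf has to be nested under a property because the root must be an object.
combined_schema = {
    "type": "object",
    "properties": {
        "response": {
            "anyOf": [
                {
                    # Multi-item branch: an array stands in for the indexed keys
                    "type": "object",
                    "properties": {
                        "items": {"type": "array", "items": {"type": "string"}}
                    },
                    "required": ["items"],
                    "additionalProperties": False,
                },
                {
                    # Single-item branch, same shape as single_item_response
                    "type": "object",
                    "properties": {"item": {"type": "string"}},
                    "required": ["item"],
                    "additionalProperties": False,
                },
            ]
        }
    },
    "required": ["response"],
    "additionalProperties": False,
}

print(json.dumps(combined_schema, indent=2))
```

The extra wrapper level and the loss of indexed keys are the cost being avoided by placing the schemas in the system message directly.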
“JSON mode” is used so the AI doesn’t try to wrap the output in markdown or emit other non-JSON text.
The output shown was formatted by the JSON library, which would have failed if the response were not valid JSON.
Thus, achievement unlocked.
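Valid JSON is not the same as following the schema, so a quick local check is worthwhile. This is a hand-rolled sketch for the multi_item_response shape only (all keys are digit strings, all values are strings), not a full JSON Schema validator. The function name matches_multi_item_schema is hypothetical.

```python
import json
import re

# Same intent as the "^[0-9]+$" pattern in the schema; fullmatch anchors both ends
INDEX_KEY = re.compile(r"[0-9]+")


def matches_multi_item_schema(raw: str) -> bool:
    """Check that raw is a JSON object whose keys are digit strings and values are strings."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # An empty object passes, as the schema sets no minimum number of properties
    return all(
        INDEX_KEY.fullmatch(key) is not None and isinstance(value, str)
        for key, value in data.items()
    )


# The response shown above
sample = """{
  "1": "A personalized piece of jewelry, like a necklace with her initials or birthstones.",
  "2": "A spa day gift certificate for some relaxation and pampering.",
  "3": "A high-quality scented candle or a set of aromatherapy oils.",
  "4": "A custom photo album or framed family photo to cherish memories."
}"""

print(matches_multi_item_schema(sample))
```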
The only difference is that with json_schema activating enforced structured output, the AI for some reason must output the name of the second-level heading, even though there are no other options and the “strict” AI cannot deviate from it. This heading is the “json_schema”->“name”, which is not actually the response schema being followed. A 10-token schema name = 10 wasted input and output tokens when emitting to the internal response recipient of the API backend. This output behavior is not trained or demonstrated in simulation.
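To make the “json_schema”->“name” field concrete, here is a minimal sketch of the request parameter for enforced structured output. The shape of the response_format dict follows the public Chat Completions API. The schema is the single-item one from my system message, trimmed to keywords I am confident strict mode accepts, and the dict would be passed as response_format=response_format to client.chat.completions.create.

```python
import json

# Single-item schema from the system message, without the title
single_item_schema = {
    "type": "object",
    "properties": {
        "item": {
            "type": "string",
            "description": "The content of the response item.",
        }
    },
    "required": ["item"],
    "additionalProperties": False,
}

response_format = {
    "type": "json_schema",
    "json_schema": {
        # This name sits beside the schema and is not part of it
        "name": "single_item_response",
        "strict": True,
        "schema": single_item_schema,
    },
}

print(json.dumps(response_format, indent=2))
```

Since the name is carried outside the schema itself, keeping it short is the only lever available over that overhead.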