Clarity on "Optional" Parameters in Structured Outputs

Hoping someone can help here. In the documentation the following is provided as a means by which we can simulate optional parameters:

{
    "name": "get_weather",
    "description": "Fetches the weather in the given location",
    "strict": true,
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location to get the weather for"
            },
            "unit": {
                "type": ["string", "null"],
                "description": "The unit to return the temperature in",
                "enum": ["F", "C"]
            }
        },
        "additionalProperties": false,
        "required": [
            "location", "unit"
        ]
    }
}

My assumption was that unit would then always be present in the response, and that the model would return null for it when it could not identify a unit.

With gpt-4o-mini, this approach sometimes drops that key from the response altogether.

Does this null approach allow the key to be dropped, or does it guarantee that the value will be F, C, or null?

If the latter, it's not functioning as advertised for gpt-4o-mini (streaming function call with Assistants, NOT completion, in the Playground…)
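To make the question concrete, here is a minimal sketch of the check being described: given the raw arguments string of a function call, it reports whether unit is missing, null, or one of the enum values. The helper name check_unit and the sample strings are made up for illustration; only the standard json module is used.

```python
import json

# Values the schema above is expected to allow for "unit".
ALLOWED_UNITS = ("F", "C", None)

def check_unit(arguments_json: str) -> str:
    """Classify the "unit" field in a function call's raw JSON arguments."""
    args = json.loads(arguments_json)
    if "unit" not in args:
        return "missing"   # the key was dropped entirely
    if args["unit"] not in ALLOWED_UNITS:
        return "invalid"   # present, but not F, C, or null
    return "null" if args["unit"] is None else "value"

# Hypothetical argument strings, not real model output.
print(check_unit('{"location": "Paris", "unit": null}'))
print(check_unit('{"location": "Paris", "unit": "C"}'))
print(check_unit('{"location": "Paris"}'))
```

Running this on the arguments string exactly as received, before any other processing, should show whether the key is already absent at that point.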

Maybe add a confidence score and a reasoning field as extra fields to push the model to “think” less lazily?
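A rough sketch of what that could look like, written as a Python dict version of the schema quoted earlier. The field names unit_reasoning and unit_confidence and their descriptions are invented for illustration, and whether this changes the model's behaviour at all is untested. The extra fields are listed before unit on the assumption that keys are written in schema order, and every property is listed in required because that is what the strict schema in the question does.

```python
# Hypothetical variant of the get_weather tool with two extra fields.
tool = {
    "name": "get_weather",
    "description": "Fetches the weather in the given location",
    "strict": True,
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location to get the weather for",
            },
            # Assumed field: free-text reasoning about which unit applies.
            "unit_reasoning": {
                "type": "string",
                "description": "Why this unit was chosen, or why none could be identified",
            },
            # Assumed field: the model's own confidence estimate.
            "unit_confidence": {
                "type": "number",
                "description": "Confidence in the chosen unit, from 0 to 1",
            },
            "unit": {
                "type": ["string", "null"],
                "description": "The unit to return the temperature in",
                "enum": ["F", "C"],
            },
        },
        "additionalProperties": False,
        "required": ["location", "unit_reasoning", "unit_confidence", "unit"],
    },
}

# Sanity check: every property is also listed as required.
params = tool["parameters"]
print(set(params["required"]) == set(params["properties"]))
```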

Did someone manage to deal with this problem?

To answer your survey: I don’t think there is a problem.

A union of a data type and null allows the AI to write JSON with either kind of value.

There are many places in code where a null value returned in JSON can have its key dropped, though I would expect that to be consistent and deterministic. In a structured response using strict, the AI cannot deviate from producing all the keys, unless there is a bug in writing or enforcing the context-free grammar.
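As an illustration of the first point, here is a minimal sketch of how a key with a null value can disappear on the client side. The raw string and the drop_nulls helper are hypothetical; the helper stands in for any serializer or clean-up step that filters out None values.

```python
import json

# Hypothetical arguments string with an explicit null for "unit".
raw = '{"location": "Paris", "unit": null}'

parsed = json.loads(raw)
print("unit" in parsed)   # the key survives plain JSON parsing

def drop_nulls(d: dict) -> dict:
    """Clean-up helper that silently removes keys whose value is None."""
    return {k: v for k, v in d.items() if v is not None}

cleaned = drop_nulls(parsed)
print("unit" in cleaned)  # the key is gone after the clean-up step
```

If something like this sits between the API response and the place where the key is inspected, the key would be missing every time the value is null, regardless of what the model wrote.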