Wrapper for structured outputs with non required fields

From the doc:
https://platform.openai.com/docs/guides/structured-outputs/supported-schemas

Although all fields must be required […], it is possible to emulate an optional parameter by using a union type with null.

Is there a workaround wrapper to use OpenAI structured outputs with a Pydantic model containing non-required fields, and get the response without the null fields?

A wrapper that converts something like this:

class Foo(BaseModel):
    count: int
    size: int = None # non required not nullable

  "response_format": {
    "properties": {
      "count": {
        "type": "integer"
      },
      "size": {
        "type": "integer"
      }
    },
    "required": [
      "count"
    ],
    "type": "object"
  }

into a schema supported by OpenAI, such as:

class Foo(BaseModel):
    count: int
    size: Optional[int]  # required but nullable

  "response_format": {
    "properties": {
      "count": {
        "type": "integer"
      },
      "size": {
        "type": ["integer", "null"]
      }
    },
    "required": [
      "count",
      "size"
    ],
    "type": "object"
  }

and then converts the output value back using the original schema.

Hi @david.motxilla !

So in Pydantic, a field either exists or it doesn’t. In this case:

class Foo(BaseModel):
    count: int
    size: int = None

size is a required field, it just has a default value None. So there is nothing to separate those two examples.

But let’s say that you instead have:

class FooRequest(BaseModel):
    count: int
    size: Union[int, None] = None

class FooResponse(BaseModel):
    count: int

then you can pass FooRequest into response_format and then create a little wrapper for the response:

def getFooResponse(request: FooRequest) -> FooResponse:
    return FooResponse(count=request.count)

# assuming here you did your API call and got the response in `r`
print(getFooResponse(FooRequest(**json.loads(r.choices[0].message.content))))

count=2

If that’s what you really want :slight_smile:

Your definition size: Union[int, None] = None creates a schema "anyOf": [{"type": "integer"},{"type": "null"}] without size being required, so OpenAI does not accept it.
Moreover, your FooResponse does not include size, so I don’t see the value in this.

I guess that the only way OpenAI accepts an optional field is the workaround required_but_nullable: Optional[str].

This shows how optional fields can be described, the resulting schema, whether each field ends up required, and how optional or 'unset' values are dumped.

import json
from typing import Optional, Union

from pydantic import BaseModel

class Info(BaseModel):
    not_required_and_nullable: Optional[str] = None
    not_required_not_nullable: str = None
    required_but_nullable: Optional[str]
    required_not_nullable: str
    union: Union[str, None] = None

print(json.dumps(Info.model_json_schema(), indent=2))

i1 = Info(
    not_required_and_nullable = "test", 
    not_required_not_nullable = "test", 
    required_but_nullable = "test", 
    required_not_nullable = "test",
    union = "test"
)

print("i1", i1.model_dump_json(indent=2, exclude_unset=True))

i2 = Info(
    # not_required_and_nullable = None, 
    # not_required_not_nullable = "test", 
    required_but_nullable = None, 
    required_not_nullable = "test",
    # union = None
)

# this excludes unset values (not_required_and_nullable, not_required_not_nullable, union), but it does not exclude required_but_nullable
print("i2 exclude unset", i2.model_dump_json(indent=2, exclude_unset=True))

# this one works as workaround, but it might also exclude None values that are set
# print("i2 exclude none", i2.model_dump_json(indent=2, exclude_none=True))

output:

{
  "properties": {
    "not_required_and_nullable": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Not Required And Nullable"
    },
    "not_required_not_nullable": {
      "default": null,
      "title": "Not Required Not Nullable",
      "type": "string"
    },
    "required_but_nullable": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "title": "Required But Nullable"
    },
    "required_not_nullable": {
      "title": "Required Not Nullable",
      "type": "string"
    },
    "union": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Union"
    }
  },
  "required": [
    "required_but_nullable",
    "required_not_nullable"
  ],
  "title": "Info",
  "type": "object"
}
i1 {
  "not_required_and_nullable": "test",
  "not_required_not_nullable": "test",
  "required_but_nullable": "test",
  "required_not_nullable": "test",
  "union": "test"
}
i2 exclude unset {
  "required_but_nullable": null,
  "required_not_nullable": "test"
}
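
For what it's worth, a quick way to see which fields would trip the "all fields must be required" rule from the doc is to compare `properties` against `required` in the generated schema. This is a minimal sketch, assuming Pydantic v2; `non_required_properties` and `Sample` are hypothetical names, not part of Pydantic or the OpenAI SDK:

```python
from typing import Optional

from pydantic import BaseModel


def non_required_properties(model: type[BaseModel]) -> list[str]:
    """Return the property names that the model's JSON schema does not list as required."""
    schema = model.model_json_schema()
    required = set(schema.get("required", []))
    return [name for name in schema.get("properties", {}) if name not in required]


# Hypothetical model, a shortened version of Info above.
class Sample(BaseModel):
    not_required_and_nullable: Optional[str] = None
    required_but_nullable: Optional[str]
    required_not_nullable: str


# Lists the fields that have a default and are therefore missing from "required".
print(non_required_properties(Sample))
```

An empty list would mean every property is listed in `required`, which is the shape the doc asks for.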

My first option (the desired one) has a schema of "size": {"type": "integer"}, without size being required. This is not accepted by OpenAI.

class Foo(BaseModel):
    count: int
    size: int = None # non required not nullable

while my second option (the workaround for OpenAI) has a schema of "size": {"type": ["integer", "null"]}, with size being required.

class Foo(BaseModel):
    count: int
    size: Optional[int]  # required but nullable

So I am asking for an existing generic workaround function that does this:

class Foo(BaseModel):
    count: int
    size: int = None # non required not nullable

completion = completions_parse_wrapper(Foo, params)


def completions_parse_wrapper(response_format, params):
    # build a temporary model whose fields are all required (nullable where needed)
    response_format_adapted = adapt_response_format(response_format)

    completion = client.beta.chat.completions.parse(
        **params,
        response_format=response_format_adapted)

    # convert the parsed value back to the original model, dropping null optional fields
    completion.choices[0].message.parsed = remove_unset_values(completion.choices[0].message.parsed, response_format)
    return completion

def adapt_response_format(response_format):
    """In this example, this would transform the Foo format, changing `size: int = None` to `size: Optional[int]`, and create a Foo_temp BaseModel:
        class Foo_temp(BaseModel):
            count: int
            size: Optional[int]
    """

def remove_unset_values(value, response_format):
    """In this example, it takes an instance of the Foo_temp BaseModel and creates an instance of the Foo BaseModel, with the size field set only if it is not None."""