Help with using Structured Outputs for JSON Patch Generation and Handling the "value" Field in JSON Schema

Hi everyone,

I’m working on generating JSON Patch operations using Structured Outputs, and I’m running into an issue with defining the value field in my JSON Schema. As you know, the value field in a JSON Patch operation can be of various types: string, number, array, or object.

The problem arises when I’m trying to define the schema for this value field. Since it can hold different types, I’m not sure how to properly describe it in the schema so that it supports all possible types (string, number, array, object, etc.).
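
For concreteness, here is the kind of patch I’m trying to generate; note how value changes type from one operation to the next (the paths and values here are just made up):

# Illustrative JSON Patch (RFC 6902) operations; paths and values are placeholders.
example_patch = [
    {"op": "replace", "path": "/baz", "value": "boo"},       # string value
    {"op": "add", "path": "/hello", "value": [1, 2, 3]},     # array value
    {"op": "add", "path": "/meta", "value": {"count": 3}},   # object value
    {"op": "remove", "path": "/foo"},                        # no value at all
]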

  1. How should I define the value field in JSON Schema to make it flexible enough to handle these cases?
  2. Is there a way to give the model some guidance on how to properly structure the JSON Patch operations? Specifically, I’d like to make sure that the op values (add, replace, remove, etc.) and path are correctly formatted and validated, and that the value field, when required, is assigned correctly based on the operation type.

I would greatly appreciate any insights or examples you can provide! Thanks in advance!


Hi @ginkiro!

One solution could be to treat value as an object. So you would predefine a bunch of “value” classes, e.g. using my pseudo-code here:

from pydantic import BaseModel

class StringValue(BaseModel):
    value: str

class IntValue(BaseModel):
    value: int

...

and then in your schema, define value as an enum that can take one of these pre-defined “value” objects.
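
In pydantic terms that “enum” effectively becomes a union, which shows up as an anyOf in the generated JSON schema; a minimal sketch building on the classes above:

class PatchItem(BaseModel):
    op: str
    path: str
    # One of the predefined "value" wrapper objects (extend with the other wrappers);
    # pydantic renders this union as an anyOf in the generated JSON schema.
    value: StringValue | IntValue | None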

I like the idea of using pydantic, but defining each type might lead to additional complexity, since you would also need to account for all permutations of arrays and objects, e.g. list[int], list[str], … Also, this could result in a confusing JSON schema, as the union produces an anyOf key containing an exceptionally long list of types.

A modified approach could look like this:

  1. Have the model return the JSON Patch value as an object with two properties: value and python_annotation.
  2. Create a function that parses the value with respect to the annotation, decoding lists and dicts from their JSON string form.
  3. Pass the annotation to pydantic.create_model to create a dynamic pydantic model, then validate the value against it so the string is converted back to the appropriate (annotated) type.

Here is a (quick and dirty) little example I whipped up with GPT:


import json
from typing import *  # star import so typing names (List, Optional, ...) are available when string annotations are evaluated

import openai
import pydantic


def convert_with_pydantic(annotation: str, value: str):
    # Build a throwaway model whose single `value` field uses the requested annotation.
    Model = pydantic.create_model('Model', value=(annotation, ...))
    # Lists and dicts arrive as JSON strings, so decode them before validation.
    if '{' in value or '[' in value:
        value = json.loads(value)
    # Validation coerces the decoded value to the annotated type.
    model = Model(value=value)
    return model.value



class Patch(pydantic.BaseModel):
    class PatchItem(pydantic.BaseModel):
        class Value(pydantic.BaseModel):
            value: str
            python_annotation: str = pydantic.Field(
                description="The type annotation for the `value` above. "
                "This will be used to convert the `value` to the correct type. "
                "List and objects should always indicate the type of their elements."
            )

        op: str
        path: str
        value: Value | None = pydantic.Field(
            description="null IF the `op` is 'remove' else a JSON value. "
        )
        
    patch_items: list[PatchItem]


system_message = """\
You are a JSON Patch Assistant. 
You evaluate an original JSON document and a result and create the JSON patch from the difference. 
Always create a patch item for the removal of keys in addition to all other operations (op).

"""

user_message = """\
original:
{
  "baz": "qux",
  "foo": "bar"
}

result:
{
  "baz": "boo",
  "hello": [1,2,3]
}
"""
messages = [
    {'role': 'system', 'content': system_message},
    {'role': 'user', 'content': user_message},
]

r = openai.OpenAI().beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=messages,
    response_format=Patch,
)

# %%
json_patch = r.choices[0].message.parsed.model_dump()['patch_items']


for item in json_patch:
    if item['value'] is not None:
        item['value'] = convert_with_pydantic(item['value']['python_annotation'], item['value']['value'])

print(json.dumps(json_patch, indent=2))
# [
#     {
#         "op": "replace",
#         "path": "/baz",
#         "value": "boo"
#     },
#     {
#         "op": "remove",
#         "path": "/foo",
#         "value": null
#     },
#     {
#         "op": "add",
#         "path": "/hello",
#         "value": [1, 2, 3]
#     }
# ]
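
If you want to sanity-check the output, one option (assuming the third-party jsonpatch package is installed) is to apply the generated patch to the original document and compare it to the target; the null value on the remove op has to be stripped first, since RFC 6902 remove operations carry no value member:

# Optional sanity check using the third-party `jsonpatch` package (assumed installed).
import jsonpatch

original = {"baz": "qux", "foo": "bar"}
expected = {"baz": "boo", "hello": [1, 2, 3]}

# RFC 6902 "remove" operations have no "value" member, so drop the null placeholder.
clean_patch = [
    {k: v for k, v in item.items() if not (k == "value" and v is None)}
    for item in json_patch
]

assert jsonpatch.apply_patch(original, clean_patch) == expected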



I don’t think it will lead to any confusion; there aren’t that many permutations. You have 4 main types (Number, String, Boolean and Integer), and you have arrays of these four types, so that’s 8 elements in the enum. I’ve used enums with 30 elements quite easily.

That’s fair, and I suppose it also depends on the model to some degree. I did run into issues with several different types in a Union with tool calling in the past, though, so I try to keep unions to two types at most (e.g. Type | None), especially for mini. The unions also don’t always translate well to OSS models, since a lot of training datasets only contain flattened schemas without anyOf. I’ve gotten into the habit of giving the model the easy way out and doing the heavy lifting on my end to reduce errors.


Thank you for the clarification, but I think there’s still a bit more to consider. In addition to the four types you’ve mentioned, JSON Patch can also work with objects. I’m interested in using objects as well, and this introduces a bit of complexity.

I’m also not entirely sure what you mean by arrays of these elements. Could you clarify how arrays of Number, String, Boolean, Integer and Objects would be represented in the JSON Schema?