Can't use proper JSON schema while JSON example works

Stack used: Python, Flask App, plain JavaScript in the frontend

Yeah, I know this title is kind of cryptic, but I honestly don’t know how to phrase it any better. Here’s my situation:

  • I have a vector store containing a file I uploaded. I added this vector store to an agent. The agent has “file search” enabled and response_format set to “auto,” along with instructions. In those instructions I also provide an example of the output I expect, like:
... some important instructions ...
example answer:
 {
    "KEY1": "VALUE1",
    "KEY2": [
        {"KEY3": "VALUE2", "KEY4": "VALUE3", "KEY5": "VALUE4", "KEY6": "VALUE5", "KEY7": ["VALUE6", "VALUE7", "VALUE8"]},
        {"KEY3": "VALUE9", "KEY4": "VALUE10", "KEY5": "VALUE11", "KEY6": "VALUE12", "KEY7": ["VALUE13", "VALUE14"]}
    ],
    "KEY8": None
}
  • I create a thread, add a message with a starting prompt, and run the thread with streaming enabled because streaming is what I want.

The API then returns the JSON in chunks, which I have to “sanitize” manually; not a problem at all. In the end I got the streaming mechanism working successfully.
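For context, the streaming part looks roughly like this (a minimal sketch with the openai Python SDK; the assistant id and prompt are placeholders, and the client assumes OPENAI_API_KEY is set in the environment):

from openai import OpenAI

client = OpenAI()

# create a thread and add the starting prompt
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="<starting prompt>",
)

# run the thread with streaming and collect the text fragments as they arrive
chunks = []
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id="asst_...",  # placeholder assistant id
) as stream:
    for text in stream.text_deltas:
        chunks.append(text)

raw = "".join(chunks)  # still needs "sanitizing" before json.loads()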

BUT… here it comes:

The API does not respond with proper JSON; it’s using single quotes instead of double quotes. I assume it’s because I am providing this “quick’n’dirty” JSON schema within the instructions only.
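To illustrate the effect (a toy example, not my actual sanitizing code): output in that style can only be parsed as a Python literal, not as JSON.

import ast
import json

raw = "{'KEY1': 'VALUE1', 'KEY8': None}"  # the model echoes the Python-style example back

try:
    json.loads(raw)  # fails: JSON requires double quotes and null, not None
except json.JSONDecodeError as err:
    print("not valid JSON:", err)

data = ast.literal_eval(raw)  # parses the Python-literal style instead
print(json.dumps(data))       # {"KEY1": "VALUE1", "KEY8": null}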

So I tried going the “official” way:

I tried to define the response schema properly and either update the assistant with it or provide it at run time via “response_format”:

Something like this:

response_format={"type": "json_schema", "json_schema": response_schema}
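In other words, either bake the format into the assistant or pass it per run; roughly like this (sketch; the assistant id is a placeholder and thread is the thread created above):

# update the assistant once ...
client.beta.assistants.update(
    "asst_...",  # placeholder assistant id
    response_format={"type": "json_schema", "json_schema": response_schema},
)

# ... or provide it at run time instead
client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id="asst_...",
    response_format={"type": "json_schema", "json_schema": response_schema},
)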

There are two ways to define the response_schema:

The easy way:

response_schema = {
    "name": "ConfigurationAssistant",
    "schema": {
        "type": "object",
        "properties": {
            "KEY1": {
                "type": "string"
            },
            "KEY2": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "KEY3": {"type": "string"},
                        "KEY4": {"type": "string"},
                        "KEY5": {"type": "string"},
                        "KEY6": {"type": "string"},
                        "KEY7": {
                            "type": "array",
                            "items": {
                                "type": "string"
                            }
                        }
                    }
                }
            },
            "KEY8": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "KEY3": {"type": "string"},
                        "KEY4": {"type": "string"},
                        "KEY5": {"type": "string"},
                        "KEY6": {"type": "string"}
                    }
                }
            }
        },
        "required": ["KEY1", "KEY2", "KEY8"]
    }
}

Or the hard way:

import pydantic
import pydantic.json_schema


class OpenAiResponseFormatGenerator(pydantic.json_schema.GenerateJsonSchema):
    # https://docs.pydantic.dev/latest/concepts/json_schema/#customizing-the-json-schema-generation-process
    def generate(self, schema, mode="validation"):
        json_schema = super().generate(schema, mode=mode)
        json_schema = {
            "type": "json_schema",
            "json_schema": {
                "name": json_schema.pop("title"),
                "schema": {
                    "type": "object",
                    "properties": json_schema["properties"],
                    "required": json_schema.get("required", [])
                }
            }
        }
        return json_schema


class StrictBaseModel(pydantic.BaseModel):
    model_config = {"extra": "forbid"}

    @classmethod
    def model_json_schema(cls, **kwargs):
        return super().model_json_schema(
            schema_generator=OpenAiResponseFormatGenerator, **kwargs
        )


class Suggestion(StrictBaseModel):
    KEY1: str
    KEY2: str
    KEY3: str
    KEY4: str

class ConfigurationAssistant(StrictBaseModel):
    KEY5: str
    KEY6: list[Suggestion]
    KEY7: list[Suggestion]
    required: list[str] = ["KEY5", "KEY6", "KEY7"]

response_schema = ConfigurationAssistant.model_json_schema()

# Adjust the generated schema to match the desired dictionary structure
response_schema = {
    "name": "ConfigurationAssistant",
    "schema": response_schema["json_schema"]["schema"]
}

Running a thread now never completes. Either I get messages like “active run,” or it simply runs into timeouts.

ReadTimeout: The read operation timed out

or

"message": "Can't add messages to thread_abc123 is active.", "type": "invalid_request_error"

Apparently I am doing something wrong with my response schema, so I’m tempted to stick with my quick’n’dirty solution, since it works quite fine.

But I also want to understand: what is the proper way to define a clean response JSON schema?

Your schema is really betraying the concept of a schema.

You have a structure, but it’s just a bunch of wildcards. It’s no surprise that the model times out; it has no idea what to do.

Why don’t your keys have names? What are you trying to do, exactly? If you have nameless keys, you want an array, not an object. It would be a good idea to include examples and descriptions for the model.

In the future you should use the OpenAI Playground to mess around with the schema and possible entries before deploying it inside your own environment.

I would also suggest collaborating with ChatGPT to build a more appropriate schema.

I suppose you put placeholders for your keys and other things because you don’t want to expose your application idea. If that’s not the case, your schema is kind of confusing.

A good place to start would be the OpenAI Playground: ask the AI to generate a schema for you and then compare it with your own to see what you’re missing.

Thanks @mat.eo, thanks @sergeliatko

You’re right - I obfuscated the keys for exactly those reasons: I thought it would make it easier to read here and, of course, avoid exposing details about what I am building.

However, I got the schemas from ChatGPT (and Claude… and Copilot, kind of a team effort), so I assumed they were perfectly right. Examples are provided in the pre-prompt instructions for the agent, as stated above.

But if I understand you correctly, my approach is not wrong; it’s just something wrong with the schema itself? I will investigate that.

I can’t really use the Playground: I have the API keys, but they are managed by another account. Would the Playground provide more information, like more verbose details? Right now I am developing the process in a Jupyter notebook, which is quite good for debugging, but in this case it doesn’t surface any additional details.

Ah, in that case it’s hard to understand what went wrong. LLMs can be robust, but also nitpicky in certain areas. For the best help, having the verbatim information is really essential.

You can add descriptions and examples directly into the schema; this may help.
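For example, something along these lines; a sketch only, keeping your placeholder key names and using made-up description text. If you also set "strict": true, keep in mind that structured outputs then require "additionalProperties": false on every object and every property listed in "required":

response_schema = {
    "name": "ConfigurationAssistant",
    "strict": True,
    "schema": {
        "type": "object",
        "additionalProperties": False,
        "properties": {
            "KEY1": {
                "type": "string",
                "description": "Explain here what KEY1 holds, ideally with an example value"
            },
            "KEY2": {
                "type": "array",
                "description": "Explain what each entry in this list represents",
                "items": {
                    "type": "object",
                    "additionalProperties": False,
                    "properties": {
                        "KEY3": {"type": "string", "description": "Explain KEY3"},
                        "KEY7": {
                            "type": "array",
                            "items": {"type": "string"},
                            "description": "Explain KEY7"
                        }
                    },
                    "required": ["KEY3", "KEY7"]
                }
            }
        },
        "required": ["KEY1", "KEY2"]
    }
}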

There’s no definite answer as there’s very little provided here.

The playground just eliminates a lot of the variables involved in debugging. It’s safe to say that if it fails in the playground then it’s a prompt issue.

Yes, the timing out would indicate that the AI went on a path of token insanity. Not being able to produce a valid JSON object really supports the notion that there’s something wrong with your schema and prompt.

A good idea would be to switch to ChatCompletions with streaming, use the response format of json_object, and see what’s output. I have a feeling you would find that it’s an endless stream of repetitive tokens.
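Something like this is enough for a quick test (sketch; the model name and prompts are placeholders, and note that json_object mode requires the word “JSON” to appear somewhere in the messages):

from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model choice
    messages=[
        {"role": "system", "content": "Answer as a JSON object following the example ..."},
        {"role": "user", "content": "<starting prompt>"},
    ],
    response_format={"type": "json_object"},
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)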


I am a little suspicious that the examples you are providing are not proper JSON objects. Are you passing these examples as serialized Python dictionaries by any chance?
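A quick way to check is to compare the two serializations of the same dictionary (illustration only):

import json

example = {"KEY1": "VALUE1", "KEY8": None}
print(str(example))         # {'KEY1': 'VALUE1', 'KEY8': None}  <- Python repr, not JSON
print(json.dumps(example))  # {"KEY1": "VALUE1", "KEY8": null}  <- what the instructions should contain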

There is a tool in playground that uses AI to generate correctly formatted schemas, I don’t think you need to pay to use that tool.

There is a tool in playground that uses AI to generate correctly formatted schemas. I don’t think you need to pay to use that tool.

Yeah, but it requires me to top up my credits. If I don’t move forward, I will do that.

I am a little suspicious that the examples you are providing are not proper JSON objects. Are you passing these examples as serialized Python dictionaries by any chance?

Both - the dictionary and the class schema - are passed as they are. As soon as they are not valid syntax-wise, client.beta.assistants.update returns an error.

A good idea would be to switch to ChatCompletions with streaming, use the response format of json_object, and see what’s output. I have a feeling you would find that it’s an endless stream of repetitive tokens.

Will do that and turn off streaming for testing.

There’s no definite answer as there’s very little provided here.

But, just in general: is Streaming + Assistant + JSON response a working combination at all? The reason I ask: all Assistants endpoints are still beta, and my exact use case isn’t covered anywhere in the docs.

Thanks so far, will get back with my results…


Well… what a ride; it feels like #nasal-demons. I tried a couple of things, and the changes that matter are sometimes really, really tiny.

Like I removed the explicit mention of the vector store name in the pre-prompt, which sometimes led to an infinite run. I also fixed JSON key references: I used different ones in my pre-prompt, and they didn’t match the ones in the source data (this is the only apparent bug I found). I made minor changes to the wording of the iterative process. I am using one example schema instead of two in the pre-prompt. And so on… “nitty-gritty” hits the nail on the head.

The most annoying thing is: It works perfectly as long as I just use the JSON from the pre-prompt. As soon as I add a JSON schema, the whole processing goes south. The good thing is: It forces me to build a robust solution.

The second most annoying thing is that I don’t get reproducible or helpful responses. It’s either a “time out” or (literally) “Sorry something went wrong” or “There’s an active run.”
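For the “active run” part specifically, a small guard like this at least makes the error predictable (a sketch, assuming the same client and thread objects as in the snippets above):

# check whether the thread still has an in-flight run before adding a new message
runs = client.beta.threads.runs.list(thread_id=thread.id, limit=1)
latest = runs.data[0] if runs.data else None
if latest and latest.status in ("queued", "in_progress", "requires_action", "cancelling"):
    print("thread is busy:", latest.id, latest.status)
else:
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content="<next prompt>"
    )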

What I learned is this: Start with the simplest prompt you can imagine and then add more complexity in small steps, testing every step at least two times.

Another important thing to consider: restart your Jupyter kernel. I had the exact same code in two cells… one runs smoothly, the other leads to errors from the API.

Thanks for your assistance :slight_smile: