from typing import Optional, Union, List, Dict, Any
from pydantic import BaseModel
from openai import OpenAI
class Table(BaseModel):
title: str
columns: List[str]
data: List[Dict[str,int]]
class Passage(BaseModel):
text: str
table: Table
class Question(BaseModel):
passage: Passage
question: str
openai_client = OpenAI(api_key=userdata.get("OPENAI_API_KEY"))
max_tokens = 8000
response = openai_client.beta.chat.completions.parse(
model="gpt-4o",
messages=[{'role':'user','content': 'Output a question'}],
max_completion_tokens=max_tokens,
response_format= Question
)
response.choices[0].message.parsed
But running into the following error:
BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'Question': In context=(), 'required' is required to be supplied and to be an array including every key in properties. Extra required key 'data' supplied.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}
Not sure if this is allowed, it’s not really adhering to a strict schema, since it would allow for ambiguous key-value pairs. Can you try removing this one from Table to see if it works?
I could remove it, but I absolutely need the data field, the list of dictionaries in the the data field contain the data that will populate the table.
And the thing is the keys aren’t fixed for this list (for e.g. it can be [{‘year’:2002},{‘pop’:1000}] for a question containing a table showing the population graph).
I was wondering if there was a way to make it output a list of dicts for data.
And thus a strict grammar cannot be built and Pydantic cannot be used.
The whole point is that the API can enforce the next key to be output by the AI in an object (the AI writes JSON, thus not a dict). Every key of an object also needs to be set within a “required” array of a JSON schema.
If you make your own JSON schema, place it in the instructive container for the API of “strict:false” along with its name, you can send it as an unenforced schema response_format that the AI doesn’t have to strictly follow, as it is then just an AI instruction.
Yes so then you have to use the “raw” JSON schema as Jay mentioned and not use the strict mode. You will then have to also do some additional validation on your response because there are no guarantees that the schema will be followed at all, but generally it will be .