Json_schema for dynamic key names

Hi guys!

I am trying to create a json_schema for a query that will return N items with the structure:

[{"information key": "value"}]

Since the "information key" is also invented by the LLM, I do not know its name in advance (it is dynamic). How can I define this case in the schema? ChatGPT-4o was unable to solve it.

So far:

{
  "type": "array",
  "items": {
    "type": "object",
    "additionalProperties": {
      "type": "string"
    }
  }
}

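import re
from typing import List

from openai import OpenAI
from pydantic import BaseModel, RootModel, ValidationError


# Each item is a single "key: value" string; the key is chosen by the model.
class DynamicItemModel(RootModel):
    root: str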


class ItemListModel(BaseModel):
    root: List[DynamicItemModel]


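def conv_to_kv(list_values: ItemListModel):
    # Turn each "key: value" string into a one-entry dict.
    k_v = []

    for list_val in list_values.root:
        # Split on the first colon only, so values that contain ":" (e.g. URLs) stay intact.
        s_list_val = re.split(r":", list_val.root, maxsplit=1)
        k_v.append({s_list_val[0]: s_list_val[1]})

    return k_v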

# SYSTEM_PROMPT is not shown in this post; define it before running.
descriptions = ['Taylor Swift fans are called swifties. 1+1 = 2. In Maths we believe.']

def test_detect_category():
    client = OpenAI()


    for description in descriptions:

        try:        
            completion = client.beta.chat.completions.parse(
                model="gpt-4o-mini",
                messages=[
                    {'role': 'system', 'content': SYSTEM_PROMPT},
                    {
                        "role": "user",
                        "content": description
                    }
                ],
                response_format=ItemListModel,
            )

            detected = completion.choices[0].message.parsed
            print(f"{description} -> {conv_to_kv(detected)}")

        except ValidationError as ve:
            print(str(ve))


test_detect_category()

Output:

Taylor Swift fans are called swifties. 1+1 = 2. In Maths we believe. -> [{'Claims': ' Taylor Swift fans are called swifties.'}, {'Facts': ' 1+1 = 2.'}, {'Beliefs': ' In Maths we believe.'}]

Thanks for your reply, but I don't see how this answers the question.

I need the json_schema for the JSON I provided above:

[{"information key": "value"}]

The LLM will return JSON like this:

[{"property_value": "200000"}, {"property_url": "https://test"}]

@costamatrix I don't think this is actually possible with json_schema. I played around with it, and the API returns a BadRequestError because the response format requires every key to be stated explicitly (in strict mode, each object must set "additionalProperties": false). But stating every key is exactly what you can't do in your case.

So the best thing to do is use JSON mode and state in your prompt that you want JSON output consisting of a list of key-value objects, e.g. [{"information key": "value"}]. Then, in your code, use Pydantic (or similar) validation to make sure you really did get a list of key-value objects. This is the simplest solution IMO.
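
The validation step could look roughly like the sketch below. It uses only the standard library as a stand-in for Pydantic, and it makes a few assumptions that are not from the thread: the function name `parse_kv_list` is hypothetical, each object is expected to hold exactly one key, and because JSON mode is documented as producing a JSON object, the list is also accepted when wrapped under an assumed `"items"` key.

```python
import json


def parse_kv_list(raw: str) -> list[dict[str, str]]:
    """Parse model output and check it is a list of one-key string objects."""
    data = json.loads(raw)

    # Assumption: the model may wrap the list, e.g. {"items": [...]}.
    # "items" is a hypothetical key name; match whatever your prompt asks for.
    if isinstance(data, dict) and isinstance(data.get("items"), list):
        data = data["items"]

    if not isinstance(data, list):
        raise ValueError("expected a JSON array of key-value objects")

    for item in data:
        # Each element must be an object with exactly one dynamic key.
        if not isinstance(item, dict) or len(item) != 1:
            raise ValueError(f"expected an object with one key, got {item!r}")
        for key, value in item.items():
            if not isinstance(value, str):
                raise ValueError(f"value for {key!r} must be a string")

    return data


# Example with the payload from the question.
raw = '[{"property_value": "200000"}, {"property_url": "https://test"}]'
print(parse_kv_list(raw))
```

Here `raw` would be `completion.choices[0].message.content` from a call that sets `response_format={"type": "json_object"}`. With Pydantic, `TypeAdapter(list[dict[str, str]]).validate_json(raw)` should cover the type checks, though not the one-key-per-object rule.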
