I made an LLM call with structured outputs, asking for a JSON object with a single key, `language`, which can only take certain values.
LLM call parameters:
"model": "gpt-4o-mini-2024-07-18",
"temperature": 0,
"response_format": {
  "type": "json_schema",
  "json_schema": {
    "name": "classification",
    "strict": false,
    "schema": {
      "strict": true,
      "type": "object",
      "properties": {
        "language": {
          "type": "string",
          "description": "The primary language of the text content",
          "enum": [
            "english",
            "french",
            "japanese",
            "korean",
            "italian",
            "german",
            "spanish",
            "chinese",
            "polish",
            "hindi",
            "indonesian",
            "russian"
          ]
        }
      },
      "required": [
        "language"
      ],
      "additionalProperties": false
    }
  }
}
I used the following system and user messages:
[
  {
    "role": "system",
    "content": "You are a helpful assistant that analyzes text and returns structured JSON."
  },
  {
    "role": "user",
    "content": "This is the content I pass to Chainlit:'\n Källa 1. [850.json](https://fiskarhedenvillan.com)\n\nKälla\n\n\nPrompt tokens: 1016, Completion tokens: 61, Total: 1077 (0.06 kr)\n"
  }
]
I expected the language to be one of the values in my enum, but I actually received `swedish`, which is not one of the languages I specified.
The same happens with `gpt-4o-2024-08-06`.
How can I force the value of the `language` key to be one of the given enum values?
I also tried the `oneOf` syntax, to no avail.
JSON schema with `oneOf` syntax:
{
  "name": "classification",
  "strict": false,
  "schema": {
    "type": "object",
    "strict": true,
    "required": [
      "language"
    ],
    "properties": {
      "language": {
        "type": "string",
        "description": "The primary language of the text content",
        "enum": [
          "english",
          "french",
          "japanese",
          "korean",
          "italian",
          "german",
          "spanish",
          "chinese",
          "polish",
          "hindi",
          "indonesian",
          "russian"
        ],
        "oneOf": [
          {
            "const": "english",
            "description": "Content primarily in English language"
          },
          {
            "const": "french",
            "description": "Content primarily in French language"
          },
          {
            "const": "japanese",
            "description": "Content primarily in Japanese language"
          },
          {
            "const": "korean",
            "description": "Content primarily in Korean language"
          },
          {
            "const": "italian",
            "description": "Content primarily in Italian language"
          },
          {
            "const": "german",
            "description": "Content primarily in German language"
          },
          {
            "const": "spanish",
            "description": "Content primarily in Spanish language"
          },
          {
            "const": "chinese",
            "description": "Content primarily in Chinese language"
          },
          {
            "const": "polish",
            "description": "Content primarily in Polish language"
          },
          {
            "const": "hindi",
            "description": "Content primarily in Hindi language"
          },
          {
            "const": "indonesian",
            "description": "Content primarily in Indonesian language"
          },
          {
            "const": "russian",
            "description": "Content primarily in Russian language"
          }
        ]
      }
    },
    "additionalProperties": false
  }
}
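One detail in both schemas may be relevant: `"strict": false` sits at the `json_schema` level, next to `name`, while `"strict": true` sits inside `schema`. As far as the Structured Outputs documentation describes it, the flag that enables schema enforcement is the one next to `name`, and a `strict` key inside `schema` is not a JSON Schema keyword. The sketch below shows what a strict variant of the payload could look like under that assumption; it is not a confirmed fix. `LANGUAGES` and `build_response_format` are hypothetical names, and `oneOf` is left out because it may not be accepted in strict mode.

```python
import json

# The twelve allowed values from the original enum.
LANGUAGES = [
    "english", "french", "japanese", "korean", "italian", "german",
    "spanish", "chinese", "polish", "hindi", "indonesian", "russian",
]


def build_response_format(languages):
    """Build a response_format payload with strict mode requested (hypothetical helper)."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": "classification",
            # Assumption: this is the flag that turns on schema enforcement.
            "strict": True,
            "schema": {
                # No "strict" key here; it is not a JSON Schema keyword.
                "type": "object",
                "properties": {
                    "language": {
                        "type": "string",
                        "description": "The primary language of the text content",
                        "enum": list(languages),
                    }
                },
                "required": ["language"],
                "additionalProperties": False,
            },
        },
    }


response_format = build_response_format(LANGUAGES)
# Print the payload as JSON so it can be pasted into a request or the playground.
print(json.dumps(response_format, indent=2))
```

With enforcement on, an input that matches none of the listed languages (such as the Swedish text above) would still have to be mapped to one of the twelve values, so an explicit fallback value such as `other` in the enum might be worth considering.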
MWE to reproduce: https://platform.openai.com/playground/chat?preset=hZDr2dMFgQpox5kEnw2BbrI2
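Independently of how the schema is configured, a client-side check can at least catch out-of-enum values such as `swedish` before they propagate. This is a minimal sketch, assuming the model's reply arrives as a JSON string; `ALLOWED_LANGUAGES` and `parse_language` are hypothetical names, not part of any SDK.

```python
import json

# Same values as the enum in the schema.
ALLOWED_LANGUAGES = frozenset({
    "english", "french", "japanese", "korean", "italian", "german",
    "spanish", "chinese", "polish", "hindi", "indonesian", "russian",
})


def parse_language(raw_content):
    """Return the language from the model's JSON reply, or None if it is missing or not allowed."""
    try:
        payload = json.loads(raw_content)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    language = payload.get("language")
    # Reject non-string values and anything outside the enum.
    if isinstance(language, str) and language in ALLOWED_LANGUAGES:
        return language
    return None


accepted = parse_language('{"language": "french"}')
rejected = parse_language('{"language": "swedish"}')
```

Here `accepted` is `"french"` and `rejected` is `None`, so the caller can retry the request or fall back to a default when the reply is outside the enum.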