Structured Output: Error "Invalid schema for response_format" persists even for valid JSON schema

digitaldrreamer · January 26, 2025, 2:32pm

I keep getting "Invalid schema for response_format". Here’s my function, including the schema, for reference:

    const completion = await openai.beta.chat.completions.parse({
      // model: "gpt-4o-2024-08-06",
      model: "gpt-4o-mini",
      messages: [
        {
          role: "system",
          content: `You are a specialized AI job data refinement agent tasked with transforming raw job listing data into a meticulously structured JSON object matching the specified job schema.

Core Objectives:
- Validate and complete ALL required fields
- Ensure 100% compliance with the schema specification
- Maintain data integrity and accuracy
- Provide intelligent inference for missing information

Field-Specific Processing Guidelines:

1. ID and Slug:
- Preserve original ID exactly
- Create URL-friendly slug if not provided
- Ensure unique identifier consistency

2. Title:
- Standardize to professional, clear job title
- Correct capitalization and formatting
- Remove unnecessary decorative text

3. Description:
- Don't generate anything new. Use existing description, albeit with grammatical correction.
- Use only allowed HTML tags: p, b, h1-h6, i, t, br, ul, ol, li, div
- Ensure semantic structure with clear sections
- Maintain tone and clarity

4. Schedule:
- Map to exact schema enum: full_time, part_time, or contract
- Infer from context if not explicitly stated

5. Level:
- Categorize precisely: internship, entry_level, junior, midlevel, senior, executive, or cofounder
- Use contextual clues from description and requirements

6. Salary:
- Extract min and max salary as strings
- Identify and standardize salary currency

7. Categories and Tags:
- Map to a provided applicable category and subcategory enums
- Generate relevant, lowercase tags in order of relevance
- Identify key skills and perks

8. Dates:
- Use current date if posting date is unclear
- Apply standard expiration policy (typically 30 days from posting)

Validation Constraints:
- ALL required fields MUST be populated
- No additional properties allowed
- Skills limited to 10 or less, sentence case
- Perks as short phrases/words
- Consistent formatting across all fields

Inference Confidence:
- Prioritize accuracy and realistic estimation

Return the refined job object strictly adhering to the specified JSON schema, demonstrating comprehensive data processing and intelligent information enhancement.
PS: Return only the JSON object. Nothing more nothing less`,
        },
        { role: "user", content: JSON.stringify(dataObj) },
      ],
      top_p: 1.0,
      temperature: 0.2,
      store: true,
      "response_format": {
        "type": "json_schema",
        "json_schema": {
          "name": "job_schema",
          "strict": true,
          "schema": {
            "type": "object",
            "properties": {
              "companyId": {
                "type": "string",
                "description": "Unique identifier for the related company."
              },
              "id": {
                "type": "string",
                "description": "Unique identifier for the job."
              },
              "slug": {
                "type": "string",
                "description": "Slug must be URL-friendly."
              },
              "title": {
                "type": "string",
                "description": "Job title is required."
              },
              "description": {
                "type": "string",
                "description": "Must be in HTML tags with structured sections about the job. Example Structure: <div>...</div>"
              },
              "schedule": {
                "type": "string",
                "enum": ["full_time", "part_time", "contract"],
                "description": "Job schedule is required."
              },
              "level": {
                "type": "string",
                "enum": [
                  "internship",
                  "entry_level",
                  "junior",
                  "midlevel",
                  "senior",
                  "executive",
                  "cofounder"
                ],
                "description": "Job level is required."
              },
              "min_salary": {
                "type": ["string", "null"],
                "description": "Minimum salary as a string (e.g., '20000')."
              },
              "max_salary": {
                "type": ["string", "null"],
                "description": "Maximum salary as a string (e.g., '20000'). Must be equal to or greater than minimum salary."
              },
              "salary_currency": {
                "type": ["string", "null"],
                "description": "Currency code (e.g., 'NGN')."
              },
              "apply_url": {
                "type": "string",
                "description": "URL to apply for the job."
              },
              "source": {
                "type": "string",
                "description": "Source of the job posting."
              },
              "type": {
                "type": "string",
                "enum": ["standard", "premium", "free"],
                "description": "Job type is required."
              },
              "category": {
                "type": "string",
                "enum": [...categories],
                "description": "Job category. Choose sensibly."
              },
              "subcategory": {
                "type": "string",
                "enum": [...subcategories],
                "description": "Subcategory under a category. Choose sensibly."
              },
              "tags": {
                "type": "array",
                "items": {
                  "type": "string"
                },
                "minItems": 5,
                "maxItems": 30,
                "description": "Between 5 and 30 tags. Choose tags based on possible search or filter terms."
              },
              "perks": {
                "type": "array",
                "items": {
                  "type": "string"
                },
                "description": "List of perks offered by the job. Only add perks explicitly mentioned."
              },
              "skills": {
                "type": "array",
                "items": {
                  "type": "string"
                },
                "description": "Skills must start with capital letters."
              }
            },
            "required": [
              "companyId",
              "id",
              "slug",
              "title",
              "description",
              "schedule",
              "level",
              "min_salary",
              "max_salary",
              "salary_currency",
              "apply_url",
              "source",
              "type",
              "category",
              "subcategory",
              "tags",
              "perks",
              "skills"
            ],
            "additionalProperties": false
          }
        }
      }
    });

Categories and subcategories are both arrays of strings, which I didn’t add because of their sizes.
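
Because both arrays are spread straight into the schema's enum fields, it may be worth ruling out locally that they contain anything other than unique strings. The snippet below is a minimal sketch of such a check. `findEnumProblems` is a hypothetical helper (not part of the OpenAI SDK), the sample data is a placeholder for the real arrays, and whether duplicates or non-string values actually trigger this particular error is an assumption, not something confirmed here.

```javascript
// Hypothetical helper (not from the OpenAI SDK): reports problems in an
// array that is about to be spread into a JSON Schema "enum".
function findEnumProblems(name, values) {
  const problems = [];
  if (!Array.isArray(values) || values.length === 0) {
    problems.push(`${name}: expected a non-empty array`);
    return problems;
  }
  const seen = new Set();
  values.forEach((value, index) => {
    if (typeof value !== "string") {
      problems.push(`${name}[${index}]: expected a string, got ${typeof value}`);
    } else if (seen.has(value)) {
      problems.push(`${name}[${index}]: duplicate value "${value}"`);
    }
    seen.add(value);
  });
  return problems;
}

// Placeholder data; the real categories array is not shown in this post.
const sampleCategories = ["engineering", "design", "engineering"];

// Logs one problem: the duplicate "engineering" at index 2.
console.log(findEnumProblems("categories", sampleCategories));
```

Running the same check on the real `categories` and `subcategories` arrays before building the request should return an empty array for each if they are clean.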

platypus · January 26, 2025, 2:38pm

Hi @digitaldrreamer! How big are your enums (the category and subcategory fields)? There are limits on the total string size in the schema, including enums, as well as on the number of enum items and the total string size across all enums.

digitaldrreamer · January 26, 2025, 2:54pm

Hey @platypus, thanks for responding. There are 27 categories and ~176 subcategories. The total character count is 277 for the categories and 2,640 for the subcategories.
The docs state:

A schema may have up to 500 enum values across all enum properties.

For a single enum property with string values, the total string length of all enum values cannot exceed 7,500 characters when there are more than 250 enum values.

I’m nowhere near those limits.
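
To double-check numbers like these against the two limits quoted above, the counting can be done on the schema object itself. This is a rough sketch, not an official validator: `collectEnums` and `checkEnumLimits` are hypothetical helpers, the category arrays are placeholders sized like the ones described in this thread, and measuring "string length" with JavaScript's `.length` is an assumption about how the API counts characters.

```javascript
// Limits as quoted from the docs in this thread.
const MAX_TOTAL_ENUM_VALUES = 500;
const MAX_ENUM_CHARS = 7500;      // applies to a single enum...
const ENUM_CHARS_THRESHOLD = 250; // ...with more than this many values

// Recursively collects every "enum" array in a schema, keyed by its path.
// Caveat: a property that is literally named "enum" would be misread.
function collectEnums(node, path = "schema", found = []) {
  if (Array.isArray(node)) {
    node.forEach((item, i) => collectEnums(item, `${path}[${i}]`, found));
  } else if (node !== null && typeof node === "object") {
    for (const [key, value] of Object.entries(node)) {
      if (key === "enum" && Array.isArray(value)) {
        found.push({ path, values: value });
      } else {
        collectEnums(value, `${path}.${key}`, found);
      }
    }
  }
  return found;
}

// Compares the collected enums against the two quoted limits.
function checkEnumLimits(schema) {
  const enums = collectEnums(schema);
  const totalValues = enums.reduce((sum, e) => sum + e.values.length, 0);
  const violations = [];
  if (totalValues > MAX_TOTAL_ENUM_VALUES) {
    violations.push(`total enum values: ${totalValues} > ${MAX_TOTAL_ENUM_VALUES}`);
  }
  for (const { path, values } of enums) {
    const chars = values.reduce((sum, v) => sum + String(v).length, 0);
    if (values.length > ENUM_CHARS_THRESHOLD && chars > MAX_ENUM_CHARS) {
      violations.push(`${path}: ${values.length} values, ${chars} characters`);
    }
  }
  return { totalValues, violations };
}

// Placeholder arrays with the same sizes as in this thread (27 and 176).
const categories = Array.from({ length: 27 }, (_, i) => `category_${i}`);
const subcategories = Array.from({ length: 176 }, (_, i) => `subcategory_${i}`);

const report = checkEnumLimits({
  type: "object",
  properties: {
    schedule: { type: "string", enum: ["full_time", "part_time", "contract"] },
    level: {
      type: "string",
      enum: ["internship", "entry_level", "junior", "midlevel", "senior", "executive", "cofounder"],
    },
    type: { type: "string", enum: ["standard", "premium", "free"] },
    category: { type: "string", enum: [...categories] },
    subcategory: { type: "string", enum: [...subcategories] },
  },
});

console.log(report.totalValues); // 216 (3 + 7 + 3 + 27 + 176)
console.log(report.violations);  // []
```

With enum sizes like the ones in this thread, the sketch reports no violations, which is consistent with the enum limits not being the cause of the error.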
