BadRequestError Invalid schema for function

IntelliJJ · January 10, 2024, 12:02pm

Hi all,

I am trying to take a resume (as a text string) and turn it into structured data with the OpenAI API.
For example, I want an array with one entry per work experience so I can use it for further processing.
I combine a hard-coded prompt with the resume text and try to use function calling to get a structured JSON response. See the code below. However, I am getting a bad request error with the following message:

Error code: 400 - {'error': {'message': "Invalid schema for function 'extract_values_from_resume': In context=('properties', 'opleiding'), array schema missing items", 'type': 'invalid_request_error', 'param': None, 'code': None}}

Here is my code:

from openai import OpenAI
from dotenv import load_dotenv
import os

# Load the API key from the .env file
dotenv_path = '/config/settings/.env'
load_dotenv(dotenv_path)
api_key = os.getenv("OPENAI_API_KEY")
hardcoded_prompt = """
   Retrieve specified values from the source text. Indicate the absence of information with '#####'. Handle multiple data occurrences as arrays. Return answer as JSON object. Here is the source text:
   
   """

def prompt_open_ai(extracted_text):
    api_key = os.getenv("OPENAI_API_KEY")
    client = OpenAI(api_key=api_key)

    tools = [
        {
            "type": "function",
            "function": {
                "name": "extract_values_from_resume",
                "description": "Retrieve specified values from the source curriculum vitae and export according to the JSON schema",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "voornaam": {
                            "type": "string",
                            "description": "voornaam"
                        },
                        "achternaam": {
                            "type": "string",
                            "description": "achternaam"
                        },
                        "woonplaats": {
                            "type": "string",
                            "description": "woonplaats"
                        },
                        "profielomschrijving": {
                            "type": "string",
                            "description": "Stuk tekst waarin de persoon zichzelf beschrijft aan het begin van het cv"
                        },
                        "motivatie": {
                            "type": "string",
                            "description": "motivatie voor een specifieke baan of opdracht"
                        },
                        "opleiding": {
                            "type": "array",
                            "patternProperties": {
                                "een gevolgde opleiding": {
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "naam": {
                                                "type": "string",
                                                "description": "naam van gevolgde opleiding"
                                            },
                                            "instituut": {
                                                "type": "string",
                                                "description": "naam van de school waar de opleiding is gevolgd"
                                            },
                                            "startjaar": {
                                                "type": "integer",
                                                "description": "eerste jaar van gevolgde opleiding"
                                            },
                                            "eindjaar": {
                                                "type": "integer",
                                                "description": "laatste jaar van gevolgde opleiding"
                                            }
                                        },
                                        "required": ["naam", "instituut", "startjaar", "eindjaar"]
                                    }
                                }
                            }
                        },
                        "certificering": {
                            "type": "array",
                            "patternProperties": {
                                "Beschrijving van een behaald certificaat": {
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "naam": {
                                                "type": "string",
                                                "description": "naam van certificaat"
                                            },
                                            "instituut": {
                                                "type": "string",
                                                "description": "naam van de instantie waar het certificaat is behaald"
                                            },
                                            "eindjaar": {
                                                "type": "integer",
                                                "description": "jaar waarin certificering is behaald"
                                            }
                                        },
                                        "required": ["naam", "instituut", "eindjaar"]
                                    }
                                }
                            }
                        },
                        "werkervaring": {
                            "type": "array",
                            "patternProperties": {
                                "Beschrijving van een specifieke werkervaring": {
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "startjaar": {
                                                "type": "integer",
                                                "description": "startjaar van werkervaring"
                                            },
                                            "eindjaar": {
                                                "type": ["integer", "string"],
                                                "description": "eindjaar van werkervaring. Zet 'heden' neer als de persoon hier momenteel nog werkzaam is."
                                            },
                                            "functietitel": {
                                                "type": "string",
                                                "description": "functietitel van werkervaring"
                                            },
                                            "bedrijf": {
                                                "type": "string",
                                                "description": "naam van bedrijf of organisatie waar deze werkervaring is opgedaan"
                                            },
                                            "plaats": {
                                                "type": "string",
                                                "description": "locatie van bedrijf of organisatie waar deze werkervaring is opgedaan"
                                            },
                                            "functieomschrijving": {
                                                "type": "string",
                                                "description": "Omschrijving van werkervaring, taken, verantwoordelijkheden, resultaten en overige informatie van deze werkervaring"
                                            }
                                        },
                                        "required": ["startjaar", "eindjaar", "functietitel", "bedrijf", "plaats", "functieomschrijving"]
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    ]
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        tools=tools,
        tool_choice={"type": "function", "function": {"name": "extract_values_from_resume"}},
        #temperature=2,
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "You are a machine that extracts specific bits of text and exports the exact quotes in JSON format."},
            {"role": "user", "content": hardcoded_prompt + extracted_text},
        ],
    )

    return response.choices[0].message.content

Any help would be appreciated, thanks.

jacob3 · January 12, 2024, 1:10pm

@IntelliJJ I’ll do my best to provide feedback – your Schema is complex and it’s in Dutch, which I do not understand.

1. Missing "required" property keyword:
   You should specify which, if any, parameter properties are "required".
   In the example below, I've made all of them required. If none are required, then "required" should be an empty list.

   [Image: screenshot of the example schema]

2. Verify implementation of "patternProperties":
   Your implementation of "patternProperties" looks odd. What exactly are you trying to achieve?
   The error you provided suggests that this is the source of the bug.
   Quoting the error: In context=('properties', 'opleiding'), array schema missing items
   When you define opleiding as being of type array, you should follow it up with items. See an example below from JSON Schema's docs:

   {
     "type": "array",
     "items": {
       "type": "number"
     }
   }

3. Redundant response_format:
   When using tools in OpenAI Chat Completion, the response_format is automatically set to {"type": "json_object"} (source). Therefore, you can remove this line of code: response_format={"type": "json_object"},

4. Provide example text:
   Could you provide an example input text you're trying to extract data from?
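
To make the point about "items" concrete, here is a minimal sketch in Python. It shows a trimmed-down version of the parameters with opleiding declared as an array whose items describe a single education entry, next to the original patternProperties shape. The helper find_arrays_missing_items is made up for illustration; it only approximates the check the API error describes and is not OpenAI's actual validator.

```python
def find_arrays_missing_items(schema, path=()):
    """Return the path of every array schema that has no "items" key."""
    missing = []
    if isinstance(schema, dict):
        if schema.get("type") == "array" and "items" not in schema:
            missing.append(path)
        for key, value in schema.items():
            missing.extend(find_arrays_missing_items(value, path + (key,)))
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            missing.extend(find_arrays_missing_items(value, path + (index,)))
    return missing


# Shape used in the question: "items" is buried inside patternProperties,
# so the array schema itself has no "items" key.
broken_parameters = {
    "type": "object",
    "properties": {
        "opleiding": {
            "type": "array",
            "patternProperties": {
                "een gevolgde opleiding": {
                    "items": {"type": "object"},
                },
            },
        },
    },
}

# Assumed fix: "items" sits directly under the array and describes one entry.
fixed_parameters = {
    "type": "object",
    "properties": {
        "opleiding": {
            "type": "array",
            "description": "gevolgde opleidingen",
            "items": {
                "type": "object",
                "properties": {
                    "naam": {"type": "string"},
                    "instituut": {"type": "string"},
                    "startjaar": {"type": "integer"},
                    "eindjaar": {"type": "integer"},
                },
                "required": ["naam", "instituut", "startjaar", "eindjaar"],
            },
        },
    },
    "required": ["opleiding"],
}

print(find_arrays_missing_items(broken_parameters))
print(find_arrays_missing_items(fixed_parameters))
```

The same change would presumably apply to certificering and werkervaring, since they use the same patternProperties structure.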

zzbbyy · January 12, 2024, 7:23pm

Not directly addressing your question - but it seems that everybody generates their schemas from pydantic models - see JSON Schema - Pydantic

Some people then remove the ‘title’ fields from these schemas: Schemas for OpenAI functions parameters
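
As a rough illustration of that title-stripping step, the sketch below removes "title" annotations from a schema dict using only the standard library. The sample dict is hand-written to resemble what a pydantic v2 model's model_json_schema() produces, and strip_titles is a hypothetical helper, not part of pydantic or the OpenAI SDK.

```python
# Keys whose value maps user-chosen names to schemas; a name such as "title"
# inside these maps is a property name, not an annotation, and must be kept.
NAME_MAPS = {"properties", "$defs"}


def strip_titles(node, inside_name_map=False):
    """Return a copy of the schema with "title" annotations removed."""
    if isinstance(node, dict):
        return {
            key: strip_titles(
                value,
                inside_name_map=(key in NAME_MAPS and not inside_name_map),
            )
            for key, value in node.items()
            if inside_name_map or key != "title"
        }
    if isinstance(node, list):
        return [strip_titles(item) for item in node]
    return node


# Hand-written sample, assumed to look like pydantic-generated output.
sample_schema = {
    "title": "Opleiding",
    "type": "object",
    "properties": {
        "naam": {"title": "Naam", "type": "string"},
        "startjaar": {"title": "Startjaar", "type": "integer"},
    },
    "required": ["naam", "startjaar"],
}

cleaned_schema = strip_titles(sample_schema)
print(cleaned_schema)
```

Tracking whether the current dict is a name map is a design choice here: it keeps a property that happens to be called "title" while still dropping the annotation inside it.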

IntelliJJ · January 16, 2024, 6:49pm

Hi @jacob3, thanks for the thoughts, I will try my best to answer.

For context, it is good to know that I am trying to extract 4 types of data from the provided resume:
- key-value pairs for personal info like names and phone numbers
- for each education, some details
- for each certificate, some details
- for each work experience, some details
For the last 3, I am specifying which details are required (the main identifiers for each).

Now as to your questions:

1. I made everything required and want the LLM to put a placeholder in place to avoid empty arrays.
2. My reasoning was to use patternProperties (e.g. every job experience is a pattern) to tell the LLM that I want the specified information for every entry, e.g. every education. In hindsight, I am probably mixing up LLM instructions and Python functions, which breaks the code.
   Thanks for the link to the JSON Schema docs, I'll look into that.
3. Right, I didn't know that. Thanks.
4. Unfortunately I can't (it's personal data). As input I upload a curriculum vitae as a Word document and extract all the text. That string is my input.

Thanks for your time, I think the answer might lie in #2.
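
For the placeholder approach described in point 1, a small post-processing step could turn the '#####' markers back into empty values once the function-call arguments arrive. The sketch below assumes the arguments come back as a JSON string (in the openai Python v1 client that string is found at response.choices[0].message.tool_calls[0].function.arguments); the helper clean_arguments and the example values are invented for illustration.

```python
import json

PLACEHOLDER = "#####"  # marker the prompt asks the model to use for missing values


def clean_arguments(arguments_json):
    """Parse function-call arguments and replace placeholders with None."""

    def clean(value):
        if isinstance(value, dict):
            return {key: clean(item) for key, item in value.items()}
        if isinstance(value, list):
            return [clean(item) for item in value]
        return None if value == PLACEHOLDER else value

    return clean(json.loads(arguments_json))


# Hypothetical arguments string, shaped like what the model might return.
example_arguments = json.dumps({
    "voornaam": "Jan",
    "woonplaats": "#####",
    "werkervaring": [
        {
            "startjaar": 2019,
            "eindjaar": "heden",
            "functietitel": "Developer",
            "bedrijf": "#####",
        }
    ],
})

result = clean_arguments(example_arguments)
print(result)
```

Values such as 'heden' pass through unchanged, so the integer-or-string eindjaar field from the schema still works as intended.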

IntelliJJ · January 16, 2024, 6:51pm

@zzbbyy I'll check it out, thanks for the link.
