How to make function calling return array as long as I want?

I’m using the gpt-3.5-turbo API to generate synonyms based on user input.
I need 10 synonyms per word, so I’m instructing the model in two ways: by simply stating “give me 10 synonyms”, and by requiring that amount with the minItems/maxItems properties of JSON Schema.
It’s not working. At best, it gives 3.
Bear in mind that before, when I used only structured text (I told it to return 10 synonyms separated by tildes [~]) and not function calling, it worked fine. Sometimes it flopped (which is why I’m moving to functions), but it was generally fine. Now it just seems to ignore my strict instructions, which is puzzling since I literally specified the amount “by the book” using JSON…
I must stress that everything else works like a charm. I’m getting exactly the parameters I want, just not the number of items needed.

Does anybody know how to make the API listen to me and return as many objects as I want?

Here’s the code (ignore syntax oddities; I’m using a PHP wrapper, so it looks a bit weird, but it’s working):

    'functions' => [
        [
            "name" => "get_synonyms",
            "parameters" => [
                "type" => "object",
                "properties" => [
                    "original_word" => [
                        "type" => "string",
                    ],
                    "synonyms" => [
                        "type" => "array",
                        "items" => [
                            "type" => "object",
                            "properties" => [
                                "synonym" => [
                                    "type" => "string",
                                ],
                                "synonym_popularity" => [
                                    "type" => "string",
                                ],
                                "synonym_style" => [
                                    "type" => "string",
                                    "description" => "is it slang, or formal, or in between?",
                                ],
                            ],
                        ],
                        "description" => "10 synonyms.",
                        "minItems" => 10,
                        "maxItems" => 10,
                    ],
                ],
            ],
        ],
    ],

Only the description is seen by the AI. Other JSON Schema keywords, such as "minItems", string formats, etc., are not passed to it.

Additionally, descriptions of inner array items are not passed.

Even “required”, seen in examples, doesn’t seem to have an AI language interpretation.

Also consider that the AI is unlikely to construct and fill in new elements of an array on its own; it tends only to fill in the structure of the parameters as described.
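If the count constraints really do not reach the model, the calling code has to check the array length itself. Here is a minimal Python sketch of that check (the thread's wrapper is PHP, but the idea carries over). It assumes the function-call arguments arrive as a JSON string shaped like the get_synonyms schema above; the helper name `check_synonym_count` and the sample data are made up for illustration.

```python
import json

EXPECTED_COUNT = 10  # the count the schema asked for via minItems/maxItems


def check_synonym_count(arguments_json: str, expected: int = EXPECTED_COUNT):
    """Parse function-call arguments and report whether the array has the expected length."""
    args = json.loads(arguments_json)
    synonyms = args.get("synonyms", [])
    return len(synonyms) == expected, synonyms


# Hypothetical arguments string with only three items, as in the reported behaviour
sample = json.dumps({
    "original_word": "happy",
    "synonyms": [
        {"synonym": "glad", "synonym_popularity": "high", "synonym_style": "formal"},
        {"synonym": "cheerful", "synonym_popularity": "high", "synonym_style": "in between"},
        {"synonym": "chuffed", "synonym_popularity": "low", "synonym_style": "slang"},
    ],
})

ok, synonyms = check_synonym_count(sample)
if not ok:
    # What to do next (retry, ask for the missing ones) is up to the application.
    print(f"got {len(synonyms)} synonyms, expected {EXPECTED_COUNT}")
```

A failed check could trigger a retry or a follow-up request for the missing items; which is better depends on the application.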

Let’s enhance:

{
    "name": "give_word_synonyms",
    "description": "Supplies formatted synonyms to the user for any synonym request",
    "parameters": {
        "type": "object",
        "properties": {
            "synonym_csv": {
                "type": "string",
                "description": "AI must discover exactly ten (10) synonyms, comma-separated"
            },
            "synonym_usage_csv": {
                "type": "string",
                "description": "AI must rank each of 10 selected from 0=rare to 10=common, comma-separated"
            },
            "synonym_style_csv": {
                "type": "string",
                "description": "AI must classify each of 10 selected from [formal, informal, slang], comma-separated"
            }
        },
        "required": ["synonym_csv", "synonym_usage_csv", "synonym_style_csv"]
    }
}
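On the receiving side, the three comma-separated fields have to be split and lined up again. Below is a rough Python sketch of that step, assuming the arguments come back as a JSON string with the three field names from the definition above. The helper name `parse_synonym_csv`, the output keys, and the sample values are illustrative, not part of any API.

```python
import json


def parse_synonym_csv(arguments_json: str, expected: int = 10):
    """Split the three CSV fields and zip them into one record per synonym."""
    args = json.loads(arguments_json)

    def split(field):
        # str() guards against the usage field arriving as a bare number
        return [part.strip() for part in str(args[field]).split(",") if part.strip()]

    words = split("synonym_csv")
    usage = split("synonym_usage_csv")
    styles = split("synonym_style_csv")

    # The model is only asked for ten items, so verify the counts here.
    if not (len(words) == len(usage) == len(styles) == expected):
        raise ValueError(
            f"expected {expected} items per field, got "
            f"{len(words)}, {len(usage)}, {len(styles)}"
        )

    return [
        {"synonym": w, "usage": int(u), "style": s}
        for w, u, s in zip(words, usage, styles)
    ]


# Hypothetical arguments string for illustration
sample = json.dumps({
    "synonym_csv": "glad, joyful, cheerful, content, delighted, pleased, merry, elated, jolly, upbeat",
    "synonym_usage_csv": "9, 7, 8, 6, 7, 8, 4, 5, 3, 6",
    "synonym_style_csv": "formal, formal, informal, formal, formal, formal, informal, formal, informal, slang",
})
records = parse_synonym_csv(sample)
print(records[0])
```

A `ValueError` here (wrong count, or a non-integer rank) is the signal to retry the request rather than pass a short list on to the user.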

First off, this is an extremely informative response. Thank you for that.
Function calling seemed to me the ideal way to make the AI return a structured response the way I define it (without having to handle extraction from text and the like). Bummer that it isn’t so. Are you sure there is no other way to set the length of an array (even without using minItems)?
It makes arrays quite useless it seems…
Thanks anyway! I will try your suggestion too, even though I’ll try to find a workaround of handling extractions first.

If the first level is of type array (for example, the first string entry switched to an array), then you can also use the description to indicate the size. But that kind of breaks as soon as only one string ends up inside it instead of ten. Alternatively, you could list out the ten strings individually as synonym_1_style, …
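Writing ten numbered properties by hand is tedious, so they could be generated instead. This is only a sketch of that idea in Python: the property names `synonym_N` and `synonym_N_style`, their descriptions, and the builder function are assumptions for illustration, and whether the model fills all of them reliably would need testing.

```python
import json


def build_flat_synonym_function(count: int = 10) -> dict:
    """Build a function definition with one pair of string properties per synonym slot."""
    properties = {}
    for i in range(1, count + 1):
        properties[f"synonym_{i}"] = {
            "type": "string",
            "description": f"Synonym number {i} of {count}",
        }
        properties[f"synonym_{i}_style"] = {
            "type": "string",
            "description": f"Style of synonym {i}: formal, informal, or slang",
        }
    return {
        "name": "give_word_synonyms",
        "description": "Supplies formatted synonyms to the user for any synonym request",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(properties),
        },
    }


schema = build_flat_synonym_function()
print(json.dumps(schema, indent=2)[:300])  # preview the start of the generated definition
```

The trade-off is a much longer function definition sent with every request, in exchange for each slot having its own description.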

@_j is this officially documented by OpenAI? (Not that I don’t trust you, since my experience completely matches what you say, but I’d just like to see the docs.)

There are no docs about what will or won’t work; OpenAI just links to a JSON Schema page and basically says “good luck”.

I wrote up which schema options actually make it to the AI.


Thank you very much, this is very informative.