Returning fixed length arrays with chat completion API

Hi all,

I have a task I’m trying to use the chat completion API for. I’m feeding in 2 comma delimited words, and I want to return a list of lists (of length 2), where the list elements are of fixed length 2. The first element in each list will be the input word, and the second element is sentiment (positive/negative/neutral). Spelling out the example, it’d be something like:

INPUT: happy, sad
OUTPUT: [[‘happy’, ‘Positive’], [‘sad’, ‘Negative’]]

Setting up a system message detailing the problem with some few-shot examples & reinforcing what the output structure looks like works well, but I noticed that you can use function calls in the API to enforce structure. I’m using a pydantic schema in the function call to ensure the list of lists is of size 2 (using pydantic’s conlist with a min_size and max_size of 2), but inputs are still skipped sometimes.

Anyone have any luck with similar tasks?

1 Like

Just curious, is there a reason you don’t just make 2 requests with one word each?

The JSON schema that is used as function output doesn’t directly support such lists while not including the various schema elements.

The only way to form that exact output as a function return would be as a single string, which would take just as much prompting but in the function description.

However, I have written a function to create such an iterable output that you can process with your code. Here’s the output:

"function_call": {
  "name": "report_sentiment",
  "arguments": "{
  "word_1": "happy",
  "sentiment_1": "Positive",
  "word_2": "sad",
  "sentiment_2": "Negative"
}"
}

and I’ve also made the function so the AI will go beyond what is directly specified:

"function_call": {
  "name": "report_sentiment",
  "arguments": "{
  "word_1": "bereavment",
  "sentiment_1": "Negative",
  "word_2": "joyous",
  "sentiment_2": "Positive",
  "word_3": "sarcastic",
  "sentiment_3": "Negative",
  "word_4": "exuberant",
  "sentiment_4": "Positive",
  "word_5": "triumphant",
  "sentiment_5": "Positive"
}"
}

How is it done? By just asking for the word and its sentiment as function parameters. The system prompt is just “Act only as sentiment classifier on input list.”

For my python, we just set this and add "functions": functions to the API call.

functions = [
{
    "name": "report_sentiment",
    "description": "Required function, report sentiment of each individual input word",
    "parameters": {
        "type": "object",
        "properties": {
            "word_1": {"type": "string", "description": "first word"},
            "sentiment_1": {"type": "string","description": "required", "enum": ["Positive", "Negative"]},
            "word_2": {"type": "string", "description": "second word. Continue writing function if more words"},
            "sentiment_2": {"type": "string","description": "required", "enum": ["Positive", "Negative"]}
        }
    }
}
]

Add “neutral” as an enum option also. enum limits output to only specific choices.

The language could even be minimized to just what works. No parameter descriptions were actually required except to make it automatically go beyond the two words correctly.

3 Likes

we’re trying to run this process on a large set of text in smaller batches - 1 at a time would definitely work and would be easier to QA, but it’d cost a lot more (there’s a fixed length string prompting the problem + variable output, len(variable output << len(system message)

this is very clever - thanks! will try it out and let you know how it goes.

Using functions, would a response be a string json?