Function-Calling Solutions?

ENV: Python 3.11 (PyCharm)
OS: Windows 10 Home
GPT: gpt-3.5-turbo-16k-0613

Hello forum,
I’m concerned that there is some broken functionality in the function-calling system. However, I’m not ruling out the possibility that my issues are user error.

I’ve written a bot (Le0sGh0st on X) that writes stories and generates images. I’ve noticed that GPT has a tendency to generate different but similar content when the prompts are similar. It’ll write 5 stories in a row about 5 different main characters, but they’ll all be named “Lilly.”

In an effort to generate content with more variety, I’ve decided to have GPT generate a list of character names to use in stories and, each time a story is written, store the names used in a database.

This way, when choosing the names, I can load the last 5 or 10 character names into a variable and append that to the GPT query for character names: f"hey, i need you to generate 10 character names, for a {chosen_genre} story. please do not use any names found on the following list: {Last_5_names_used}"

Now, when I do this raw, without function calling, it hits the nail on the head every time. I get 10 fresh names every time. However, when I put this through the function-calling system, I get weird results.

Sometimes it’ll give me weird shit like the name of the genre chosen earlier in the process… When I remove references to the genre, it more often than not comes back either:
A: empty
B: containing nothing but the list of names I’ve instructed it not to use. Almost defiant.

Empty function-call responses are detrimental and almost always result in an error. I hate to add error handling here and just retry on failure, because that can get costly.

It can fail often, returning malformed function-call arguments or JSON responses that are slightly off. When it fails, it fails pretty consistently (I’ve let it run in a loop until successful and racked up nearly a dollar in gpt-3.5-turbo-0613 requests).

Now, by contrast:

I’ve written a virtual assistant that takes advantage of the function-calling system. In that context, the function calls work great, because the responses from the GPT API are generally one or two words.

For instance, when you speak to the assistant, your input is sent to GPT with a query about its context: “question”, “command”, “task”, “cmd_line”, etc. So if you say “open paint,” the script sends that to GPT, asks whether it’s a question, a command/task request, etc., and GPT will respond “cmd_line” in the function call.
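Roughly, the classification function definition looks something like this (a sketch; the enum values are the real categories, the rest of the schema is illustrative):

# Sketch of an intent-classification function definition. The enum values
# are the real categories used by the assistant; the schema shape here
# is illustrative, not the exact code.
classify_function = {
    "name": "classify_input",
    "description": "Classifies the user's spoken input by intent.",
    "parameters": {
        "type": "object",
        "properties": {
            "intent": {
                "type": "string",
                "enum": ["question", "command", "task", "cmd_line"],
                "description": "the category of the user's input"
            }
        },
        "required": ["intent"]
    }
}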

While it occasionally miscategorizes the context/intent of user input, it’s at about a 99% success rate here, and at least when it errs, it errs in picking the right context; the JSON response is always properly formatted.

Am I asking too much for GPT to parse out 10 character names into a JSON response and accurately apply the logic I’ve instructed it to?

Has anyone else had any issues with this?

Example usage:

Function Call Definitions:

functions = [
    {
        "name": "get_genre",
        "description": "Generates a Genre of fiction for the next story",
        "parameters": {
            "type": "object",
            "properties": {
                "genre": {
                    "type": "string",
                    "description": "the genre of fiction"
                }
            },
            "required": ["genre"]
        }
    },
    {
        "name": "gen_char_names",
        "description": "Generates a list of 10 character names for the story.",
        "parameters": {
            "type": "object",
            "properties": {
                "char_list": {
                    "type": "string",
                    "description": f"this is a list of 10 character names."
                }
            },
            "required": ["char_list"]
        }
    },
    {
        "name": "get_title",
        "description": "Generates a title for the story.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {
                    "type": "string",
                    "description": "this is the title of the story based on the plot outline provided"
                }
            },
            "required": ["title"]
        }
    },
    {
        "name": "get_author",
        "description": "Generates an author for the story.",
        "parameters": {
            "type": "object",
            "properties": {
                "choice": {
                    "type": "string",
                    "description": "this is the author whos voice will inspire the story based on the plot outline provided"
                }
            },
            "required": ["choice"]
        }
    },
]
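
One variation worth noting (not what I’m running above, just a sketch): char_list is declared as a string, but function-calling schemas also accept JSON array types, which forces the arguments to come back as a real list:

# Sketch: declare char_list as an array of strings instead of one big string,
# so the function-call arguments come back as a proper JSON list.
gen_char_names_array = {
    "name": "gen_char_names",
    "description": "Generates a list of 10 character names for the story.",
    "parameters": {
        "type": "object",
        "properties": {
            "char_list": {
                "type": "array",
                "items": {"type": "string"},
                "description": "a list of exactly 10 character names"
            }
        },
        "required": ["char_list"]
    }
}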

Actual API Request:

import json
import openai

def gen_char_names(genre, prompt):
    # Load the most recent names from the database so GPT can be told to avoid them
    last_characters = get_last_five_("Character_List")

    print(f"cleaned up characters to avoid: \n {last_characters}")
    while True:  # Keep trying until successful (unbounded retries can get costly)
        try:
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo-16k-0613",
                messages=[
                    {
                        "role": "system",
                        "content": "you are a helpful assistant. Follow the prompt directions exactly."
                    },
                    {
                        "role": "user",
                        "content": f"develop a list of 10 character names that are different from the following list of names: '{last_characters}' do not send back an empty response. do not use any of the following names: {last_characters}"  # Use this prompt as inspiration: {prompt}..
                    }
                ],
                functions=functions,
                # Force the model to call gen_char_names (index 1 in the functions list)
                function_call={"name": functions[1]["name"]},
                temperature=0.9,
                max_tokens=100,
                top_p=1,
                frequency_penalty=1,
                presence_penalty=1,
                n=1,
            )

            # Parse the function-call arguments (a JSON string) into a dictionary
            response_json = json.loads(
                response["choices"][0]["message"]["function_call"]["arguments"],
                strict=False,
            )

            # Extract the character names from the parsed arguments;
            # message["content"] is None when a function call is forced
            character_list = response_json["char_list"]

            # Print the character list
            print(f"List of Characters: \n {character_list}")

            return character_list
        except (KeyError, json.JSONDecodeError) as err:
            # Malformed or empty function-call arguments: log and retry
            print(f"function call failed ({err}), retrying...")
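
Since the unbounded loop is what racked up the cost, a capped-retry variant is probably what I should be running instead. A minimal sketch (same functions list as above, request details otherwise unchanged):

import json
import openai

MAX_RETRIES = 3  # cap attempts so a consistently failing call can't rack up cost

def gen_char_names_capped(messages):
    """Sketch: retry the forced function call a few times, then give up."""
    for attempt in range(MAX_RETRIES):
        try:
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo-16k-0613",
                messages=messages,
                functions=functions,
                function_call={"name": "gen_char_names"},
                temperature=0.9,
                max_tokens=100,
            )
            args = json.loads(
                response["choices"][0]["message"]["function_call"]["arguments"],
                strict=False,
            )
            return args["char_list"]
        except (KeyError, json.JSONDecodeError) as err:
            print(f"attempt {attempt + 1} failed: {err}")
    return None  # all retries failed; caller decides how to handle it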

Addendum:
The reason I want to use function calling is that, if I don’t, GPT has a hard time not expositing with the answers. So, instead of getting 10 names separated by commas, I get:
“Oh of course I can generate 10 names for your newest ‘Thriller’ story. But first, let me tell you that as an AI language model I’m designed by our wonderful corporate overlords to be just as helpful as can be, okay, that’s 25 tokens used up on garbage, here’s your names.
name one: bill
name two : ted
etc.”

It can’t help itself but keep that token count high. Not to mention all the text parsing required just to get 10 names out of the response. But for whatever reason, when I use function calls, it seems almost defiant; nearly every time, the function call just regurgitates the list of names I did not want it to use.

What’s your system prompt look like? You should be able to get just the list without a function call…

presence_penalty and frequency_penalty do not need to be used for gpt-3.5-turbo calls. The AI has enough training to not get caught in a loop. You’ll make it so that one curly bracket is output and then the AI won’t want to produce another.

Both top_p and temperature are too high for reliable function-calling. You’ll get alternate tokens generated that are not the best choices for formatting output.

Try:

temperature = 0.5
top_p = 0.5

Also, be mindful of gpt-3.5-turbo-16k-0613 - the model with -16k context costs twice as much to use even if you don’t utilize the longer context length.

The AI will tend to train itself on its own outputs and gets hung up on what it just produced. You may need to curtail conversation history if you need unique outputs – the opposite of what one would expect, where you want the AI to not make the same thing.

You can also tell the AI to make each output completely new and unique from previous writings, but even that often isn’t enough to fix simple writing assignments.

You can enhance function-call quality by putting another parameter in first, one that improves the AI’s understanding of its task and that the AI can reflect on. For example, for the title of the story, have it first complete a required property “new_story_description”, as sketched below.
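
Reworking the get_title definition from the original post along those lines (a sketch):

# Sketch: add a required "reflection" property first, so the AI must describe
# the story before it produces the title.
get_title_with_reflection = {
    "name": "get_title",
    "description": "Generates a title for the story.",
    "parameters": {
        "type": "object",
        "properties": {
            "new_story_description": {
                "type": "string",
                "description": "a brief, fresh description of the new story, written first"
            },
            "title": {
                "type": "string",
                "description": "this is the title of the story based on the plot outline provided"
            }
        },
        "required": ["new_story_description", "title"]
    }
}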

I can, just not reliably.
90% of the time I can get it to provide just the list, but sometimes it gives me pretext and exposition, which means I end up with characters named “Ai Language Model”, and that’s not right. lol

Function calls usually help keep any of that out of the argument, except when it’s being asked to do two things at once:

1: write a list of names,
2: don’t use these names

But maybe it’s some of the other settings I’m using that’s causing the error, which makes sense. I’ll try some of the suggestions from the next reply down.


Yeah, that’s why I was thinking your system prompt might need work.

Good luck tinkering, and let us know if you can’t get it sorted - or if you do so that it might help others in the future.

This works most of the time, without function calls:
f"I need a list of 10 character names. please respond only with the list of names; the response you send will be used in python code and anything other than 10 names will disrupt the program. use the following prompt for inspiration: {prompt}. do not use names that are found on this list: {last_characters}. Only respond with the list of names, no prefacing or exposition, just names please, separated by commas and no line breaks."

But like I said, it’s a lot of reiteration just on how to format the response, where a proper function call wouldn’t need any of that extra token usage. Also, it’s just weird that function calls seem to lose some logic capability when presented with complex decisions.

I wonder if you couldn’t work the list of names you don’t want in as examples, then say “come up with unique names based on the examples”?

Then you could take out the “do not use” instruction, because those names are now an example of the output you want…

Example Names: {last_character1}, {last_character2}, {last_character3}

etc…

… which might help also.

ETA: Asking for 10 names without numbering them is also a bit harder. I would do something like this maybe…

…then strip out the numbers and add the comma in your code…
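
The stripping step would be something like this (raw_reply is a stand-in for the model’s numbered response):

import re

raw_reply = "1. Bill\n2. Ted"  # stand-in for the model's numbered-list response

# Strip "1." / "2)" style numbering from each line, then join with commas.
names = [re.sub(r"^\s*\d+\s*[\.\):]?\s*", "", line).strip()
         for line in raw_reply.splitlines() if line.strip()]
csv_names = ", ".join(names)
print(csv_names)  # Bill, Ted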

I appreciate the examples. I have some other working code that is similar to this, with a lot of string parsing after the response; the brilliance of function calls was that I wouldn’t need to do any of that. I would (and maybe will) normally ask it to provide the list wrapped in double hashtags: ##name1,name2,name3,name4## and then parse the text between the ##'s. But again, this is what I was hoping the function calls would help avoid, and for most use cases, it does. Unfortunately, the current models seem to struggle sometimes, I guess.

I’m going to do some more testing with different settings and see the results. I can keep this thread posted if I find a good solution; however, if anyone sees/knows what’s causing the error (returning the list of names it’s instructed not to use), I’d be grateful.
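
For reference, the ## parsing is only a couple of lines (reply_text is a stand-in for the raw response):

import re

reply_text = "##Bill, Ted, Rufus##"  # stand-in for the model's raw response

# Pull the comma-separated names from between the ## markers.
match = re.search(r"##(.*?)##", reply_text, re.DOTALL)
names = [n.strip() for n in match.group(1).split(",") if n.strip()] if match else []
print(names)  # ['Bill', 'Ted', 'Rufus']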


Yeah, it’s still definitely a beta project, but it’s improving quickly. I’ve been working with it since GPT-2, and it’s crazy how far it’s come. I haven’t done a lot with functions yet, as I handle mostly everything in ChatML on my own, and I’ve wanted to let it mature a bit before I go that route. I’m sure it will improve soon.

Because of how they work, negatives are hard for LLMs to handle reliably at the moment. I’m sure this will improve over time as well. This could be what’s causing it to hiccup on you, though.

Yeah, please do. Hopefully someone with more function call experience drops by or you figure it out on your own. Good luck.

The weird thing about this, though, is that when I ask in a general chat completion without function calls, it doesn’t seem to struggle at all with the negative aspect; it reliably gives me names without using previous names. But here’s the list of the names not to use currently in my database:

**Character_List to avoid**

I apologize for the mistake in my previous response. Here is the corrected list of 10 character names:

1. Aurora
2. Gabriel
3. Seraphina
4. Phoenix
5. Orion
6. Astrid
7. Caelum
8. Nebulae 
9. Zephyr 
10.Celestia, Serenity, Luminara, Nova, Zenith, Elektra, Solstice, Valkyrie, Aegis,
Galaxy, 1. Aurora
2. Gabriel
3. Seraphina
4. Phoenix
5. Orion
6. Astrid
7. Caelum
8. Nebulae
9. Zephyr 
10. Celestia, John, Emily, Michael, Sarah, James, John, Emily, Michael, Sarah, James

You see the weirdness there: the apology, and the list appearing twice. The list is listed twice because the last run responded with the exact list of names given to it from before (the first set of 1-10 were given to GPT as names not to include in the response; the only thing GPT responded with was the same list, so the next database entry is now a repeat).

I apologize for the mistake in my previous response. Here is the corrected list of 10 character names:

List of Characters Chosen:

 John, Emily, Michael, Sarah, James

The example here is using the function call. I accidentally deleted a small paragraph between the example and the start of the post. Sorry.

You’re printing the list of do-not-use characters twice?

Yeah, reiteration. It seems to take things from the front and the back of a prompt better, so, yeah, twice. I tried just once and kept getting repeat names.


I think the ultimate solution for now is to keep GPT function calls to one- or two-word responses. I have decided to go with Python libraries for things like names: faker for now. This way I can generate a list of 10 unique names, test them against the database, and make sure that all the names chosen are unique; see the sketch below.
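
A sketch of that approach (get_last_five_ is my existing database helper from the code above; Faker() and first_name() are the real faker calls):

from faker import Faker

fake = Faker()

# Sketch: generate candidate first names with faker, keeping only names
# that aren't already in the database, until we have 10 unique ones.
used = set(get_last_five_("Character_List"))  # existing database helper
names = []
while len(names) < 10:
    candidate = fake.first_name()
    if candidate not in used and candidate not in names:
        names.append(candidate)

print(names)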

I am going to try to pass those new names to GPT and ask it to riff on them: “use these names as inspiration and generate a new list of names.” We’ll see if I can get it to do positive variety instead of asking it to not do something.

Python is pretty cool; there’s basically a library for everything. I feel like it’s only a matter of time before you can write a Python library for the kitchen sink… <eyeballs 3d printer collecting dust in the corner>

So for now I’m going to handle these generations using what I assume is essentially the ol’ fashioned random number generator and a couple of dictionary files. Totally fine.

for now.