Hi, I am having trouble using function calling with fine-tuned models. I tried different approaches. First I fine-tuned the model on prompt/response pairs and tried to use function calling during inference. That didn't work; the response seemed to have an issue with the stop token, I guess, because it gave multiple answers to the same question. Then I tried adding functions along with the messages in the fine-tuning dataset, and that didn't work either. Finally I tried a hybrid approach, adding the functions list to the messages where functions are called and leaving it out of the others, and that didn't work either. So, can anyone kindly tell me how to use function calling with a fine-tuned model? Thanks
Function calling is not yet available for fine tuned models
Function calling is now available for fine tuned models.
What's not yet available is a success story of someone actually using fine-tuning with functions productively and successfully…
Oh cool. Okay, I’ll give it a try and let you know. Thanks, I wasn’t aware it was available on fine tuned models
Are there any resources or ideas anyone would like to share on the topic? Thanks
I'm facing the same issue. Judging by the sheer number of people who have reported it, it's definitely a bug.
Maybe if more people report the same issue it will get noticed; so far it seems to be ignored.
Functions work as normal for 3.5 for me. I just switched the model out for my fine-tuned one and functions still work the same.
@sardararslan033 Can you share some dummy code to reproduce the issue? I also tried function calls with fine-tuned models.
Here you go @kjordan
Code:
response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:xxxxxxxxxxxx",
    messages=[
        {
            "role": "system",
            "content": system_prompt
        },
        {
            "role": "user",
            "content": question_that_doesnt_require_function_call
        }
    ],
    temperature=0,
    functions=custom_functions,
    function_call="auto")
response["choices"][0]["message"]["content"]
Output:
answer_1 (correct answer) + \n + answer_2 (useless).
If I remove the last two lines, the output is correct. Those lines are:
functions=custom_functions,
function_call="auto")
Here's something to try: the function model may have been trained on stop tokens that are not used by the particular gpt-3.5-turbo endpoint but are used by other "Chat with GPT" products.
Add to your exact same API call after model:
stop = ["<|im_end|>", "<|fim_suffix|>", "<|endoftext|>"],
See if that makes the AI stop where expected instead of continuing to a second answer. Whether it works will depend on whether the "stop" sequences are token-encoded and what they are actually trying to stop.
This bug has still not been resolved. Is anyone from OpenAI looking into this?
Here is a working demonstration of the problem with the following python script and my tuned model:
import openai
import config
import mysecrets
# Set your OpenAI API key and organization (if applicable)
openai.api_key = mysecrets.OPENAI_API_KEY
openai.organization = config.OPENAI_ORG
chatParams = {
    "model": "ft:gpt-3.5-turbo-0613:artist::8LhakJy8",
    "temperature": 0.7,
    "messages": [
        {"role": "assistant", "content": "What can I help you with today?"},
        {"role": "user", "content": "yo"},
    ]
}
print("ChatCompletion results with fine-tuned model:")
print(openai.ChatCompletion.create(**chatParams))
chatParams["functions"] = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location"]
        }
    }
]
print("ChatCompletion results with fine-tuned model and functions:")
print(openai.ChatCompletion.create(**chatParams))
The results exemplify the problem:
% py broken-model.py
ChatCompletion results with fine-tuned model:
{
  "id": "chatcmpl-8LkKYXeeCoQXpz4czYbzSl7BxD3Lg",
  "object": "chat.completion",
  "created": 1700193674,
  "model": "ft:gpt-3.5-turbo-0613:artist::8LhakJy8",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}
ChatCompletion results with fine-tuned model and functions:
{
  "id": "chatcmpl-8LkKZDhXiyfe5yTCKiPjKWrZCfFQo",
  "object": "chat.completion",
  "created": 1700193675,
  "model": "ft:gpt-3.5-turbo-0613:artist::8LhakJy8",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?\nHello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 87,
    "completion_tokens": 20,
    "total_tokens": 107
  }
}
Interestingly, I ran it 10 times and it doubled the response on the call with functions only twice. With my app (with functions included) it gave me a doubled response every single time.
Add a custom stop token, like ####, at the end of every completion in the fine-tuning dataset, and use stop="####" in the chat completion call. That's how I solved this issue.
Clever! This works!
I was able to retrain with stop words and all is fine again.
Thanks.
How do you do “Add a custom stop token in the fine tuning dataset like ####”? I do not quite get it.
Here’s the original solution (or rather one possible workaround) I put forth:
Let’s discuss more.
Current fine-tune plus functions
When giving examples for a gpt-3.5-turbo fine-tune, you will typically provide a list of complete conversations, each of which will have at least:
- a system message, where it is useful to assign your fine tune its own identity
- a user input, typical of how your model is anticipated to be used
- then an AI response, customized and different from how the AI would typically respond.
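To make that concrete, a single training example in the JSONL file looks roughly like this (the identity and wording here are placeholders of my own):

{"messages": [{"role": "system", "content": "You are Artful, the assistant for my art shop."}, {"role": "user", "content": "What can you help with?"}, {"role": "assistant", "content": "I can answer questions about our prints and help you place an order."}]}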
New: a special way of training the AI to emit a function call, and to see function-call returns, was also introduced for fine-tuning the chat model with your examples. The server-side API recognizes these functions and puts them in a special response.
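If I have the format right, a function-calling training example also carries the function definitions, and the assistant turn can be a function call instead of text, roughly like this (shown wrapped for readability; each example is one line in the file, and the weather function is just the standard illustration):

{"messages": [
  {"role": "user", "content": "What's the weather in San Francisco?"},
  {"role": "assistant", "function_call": {"name": "get_current_weather", "arguments": "{\"location\": \"San Francisco, CA\"}"}}
 ],
 "functions": [
  {"name": "get_current_weather", "description": "Get the current weather in a given location",
   "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}
 ]}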
Completion model background
Earlier completion models are simply designed to continue writing where the input left off. This could be finishing your sentence, or, more interestingly, you provide an introduction or a question, and the AI then writes an answer as the next logical thing such a document would contain.
To fine-tune these models to answer questions better than that trickery allows, we could insert a separator (such as "- - -") after the user input, and then train (or prompt) the AI to write its response to the question after seeing such a division.
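In the legacy completion fine-tune format that would be a prompt/completion pair something like this (the question and answer are made up; the completion conventionally starts with a space):

{"prompt": "What colors do the posters come in?\n- - -\n", "completion": " They come in black-and-white and full color."}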
Chatting with completions
An interesting thing: we can take the AI model designed to "write the next thing" and tell it that what it will see is a conversation between two people. We can even describe one of the parties in the conversation as being an AI intelligence.
Then, by giving each of these parties their own prefix, such as "human" and "AI", or "user" and "assistant", we can end the input right where the "AI:" response should be the next thing written, and the AI writes as if it is responding to the person.
Stop sequences
Now the problem: the AI doesn't know when it's done. It still somewhat sees that conversation as if it were writing a document, so it could continue writing more user questions and AI responses, continuing until it runs out of things to say, or even summarizing and talking about the conversation it just saw.
So we have to make it stop.
The API completion call can be given a stop sequence like “\nuser:”. If the AI writes that itself, the characters are recognized and the generation of output is terminated.
This can also be used to train the AI to "wrap up" answers at the length typical of an answer. If we are not in a chat context but building a Q&A or data processor, we could fine-tune the AI to produce a distinctive stop sequence not otherwise seen after a finished answer, like "[####]", and by recognizing that, we can also stop the completion so the AI doesn't keep writing aimlessly.
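A minimal sketch of that idea with the legacy completions API (the prompt text and model choice are only illustrative):

import openai

prompt = (
    "The following is a conversation between a user and an AI assistant.\n"
    "user: How do I reset my password?\n"
    "AI:"
)
response = openai.Completion.create(
    model="gpt-3.5-turbo-instruct",
    prompt=prompt,
    max_tokens=200,
    stop=["\nuser:"],  # halt output if the AI starts writing the next user turn itself
)
print(response["choices"][0]["text"])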
Chat completions - containers for conversations
OpenAI had an idea: instead of letting the user insert their own "AI" text to fool the model with fake messages seemingly written by the AI, or messages which would elevate their status, the messages sent to the AI would be wrapped in special tokens that cannot be represented with normal text (or those strings could be screened out).
The AI is trained by OpenAI's own fine-tuning that when it is done, it should not produce a stop sequence like "user" or "###", but rather output a special stop token that the API developer doesn't even need to specify.
Broken fine-tune with functions
Why is the AI repeating the output and not emitting a stop sequence token? Something has gone wrong with the example messages when a function is used in them.
Something is wrong with the specification of functions. OpenAI doesn't want you to know the precise AI language of function inputs and outputs either, so you have to use their mechanism in fine-tuning. The AI is being trained on the wrong stop sequence or none at all, or is being overtrained on a stop token used only for functions. Either way, it keeps on writing.
Solution
We now know how we would have fine-tuned an older model to stop, and how OpenAI would fine-tune a new model to stop with a special token when you use their “chat completion” examples.
We can again re-introduce our own stop sequence to halt the output when OpenAI’s technique doesn’t work.
At the end of each AI response that you are training the model to produce, you can, not trusting that OpenAI is doing the job right, add your own stop sequence back, such as the "######" I suggest (or even a much longer unusual string that still encodes to one token).
With enough training on AI responses ending with the new sequence (which means introducing more regular conversations back into your training set, not just functions), we can teach the AI our new stop sequence.
The AI might produce that string and then still repeat itself, but by using the API parameter stop="######" for our own custom stop sequence, we can have OpenAI's servers detect when the AI produces it and halt the output before any repeats are seen.
Use:
- Add the stop character sequence to the end of all “assistant” messages when also training on function-call;
- Add more normal chat examples covering a broad range of topics, with that "assistant" end sequence again as the last thing written;
- Fine-tune your model;
- Call your API model with the additional “stop” parameter;
- Hopefully no more repeats.
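A minimal sketch of the whole recipe, with placeholder wording of my own (custom_functions stands for your own function list from earlier in the thread). Each assistant turn in the training file ends with the marker:

{"messages": [{"role": "system", "content": "You are Artful, the assistant for my art shop."}, {"role": "user", "content": "Do you ship internationally?"}, {"role": "assistant", "content": "Yes, we ship to most countries.######"}]}

and the inference call passes the same marker as a stop sequence:

response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-0613:xxxxxxxxxxxx",
    messages=[{"role": "user", "content": "Do you ship internationally?"}],
    functions=custom_functions,
    function_call="auto",
    stop="######",
)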
@_j So is fine-tuning with tools and tool_choice just not available at the moment?
I just tried it in code. It is supported. I can get the toolCalls from the response when I use my fine-tuned model based on gpt-3.5-turbo-1106.
What is not supported is a training file method for specifying “tool” instead of “function”.
However, it is basically the same language that you are training on.
Tools is just the upper hierarchy they had planned, now being exposed, and also a mechanism that no longer lets you put whatever roles you want into the AI context, since it requires an ID from the API's call as "permission".
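A minimal sketch of the kind of inference call the reply above describes (the model name is a placeholder, and it assumes the openai 0.x Python library passes tools and tool_choice straight through to the API):

import openai

response = openai.ChatCompletion.create(
    model="ft:gpt-3.5-turbo-1106:xxxxxxxxxxxx",
    messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string", "description": "City and state"}},
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)
# A tool call, if any, shows up under message["tool_calls"] instead of "function_call"
print(response["choices"][0]["message"])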