Function calling with a fine-tuned model

Here’s the original solution (or rather, one possible workaround) I put forth. Let’s discuss it in more depth.

Current fine-tune plus functions

When providing examples for a gpt-3.5-turbo fine-tune, you will typically supply a list of complete conversations, each of which has at least:

  • a system message, where it is useful to give your fine-tune its own identity;
  • a user input, typical of how you anticipate the model being used;
  • an AI response, customized and different from how the base model would typically respond.

New: OpenAI also introduced a way to train the model, within your fine-tuning examples, to emit a function call and to consume function-call returns. The server-side API recognizes these function calls and places them in a special part of the response.
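As a sketch, one line of such a fine-tuning .jsonl file might look like the following. The field names follow OpenAI's published chat fine-tuning format at the time of writing, but verify against the current docs; the `get_weather` function and its schema are hypothetical, for illustration only.

```python
import json

# One training conversation (one line of the .jsonl file) that teaches
# the model to emit a function call and then answer from its return.
example = {
    "messages": [
        {"role": "system", "content": "You are WeatherBot, a concise weather assistant."},
        {"role": "user", "content": "What's the weather in Paris?"},
        # The assistant turn the model should learn: call a function.
        {"role": "assistant", "function_call": {
            "name": "get_weather",                       # hypothetical function
            "arguments": json.dumps({"city": "Paris"}),
        }},
        # The function's return value, fed back as its own message.
        {"role": "function", "name": "get_weather",
         "content": json.dumps({"temp_c": 18, "sky": "cloudy"})},
        # The final assistant turn answering from that return.
        {"role": "assistant", "content": "It's 18 \u00b0C and cloudy in Paris."},
    ],
    "functions": [{
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
}

# Each training example must serialize to a single JSONL line.
jsonl_line = json.dumps(example)
print(jsonl_line)
```

Note the two distinct assistant turns: one that calls the function, and one that writes the final answer; both are things the fine-tune teaches.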

Completion model background

Earlier completion models are simply designed to continue writing where the input left off. That could mean finishing your sentence, or, more interestingly, you provide an introduction or a question, and the AI then writes an answer as the next logical thing such a document would contain.

To fine-tune these models to answer questions more reliably than such trickery allows, we could insert a separator (such as “- - -”) after the user input; the AI is then trained (or prompted) to write its response to a question after seeing that division.
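A minimal sketch of that prompt layout, with the separator string chosen arbitrarily (pick anything unlikely to appear in real inputs):

```python
# Separator the completion model is trained (or prompted) to treat as
# "the answer starts here".  The exact string is an arbitrary choice.
SEPARATOR = "\n- - -\n"

def build_prompt(user_question: str) -> str:
    """Lay out the user's text, then the separator; the model's
    continuation after the separator is the answer."""
    return user_question + SEPARATOR

prompt = build_prompt("What is the boiling point of water at sea level?")
print(prompt)
```

The same layout is used at both training time and inference time, so the model always sees the division in the same place.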

Chatting with completions

An interesting trick: we can take a model designed to “write the next thing” and tell it that what it will see is a conversation between two parties. We can even describe one of those parties as an AI intelligence.

Then, by giving each of these parties their own prefix, such as “human” and “AI”, or “user” and “assistant”, we can leave off right where the “AI:” response should be the next thing written, and the model responds as if it were talking to the person.
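A sketch of building such a transcript-style prompt; the preamble wording and the “user”/“AI” prefixes are illustrative choices, not a fixed format:

```python
# Frame the input as a two-party transcript and stop right after the
# "AI:" prefix, so the model's natural continuation is the AI's reply.
def chat_prompt(history: list[tuple[str, str]], user_input: str) -> str:
    lines = [
        "The following is a conversation between a human and a helpful AI assistant.",
        "",
    ]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"user: {user_input}")
    lines.append("AI:")  # leave off here; the model writes what follows
    return "\n".join(lines)

p = chat_prompt([("user", "Hi"), ("AI", "Hello! How can I help?")],
                "Tell me a joke")
print(p)
```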

Stop sequences

Now the problem: the AI doesn’t know when it’s done. It still sees that conversation somewhat as a document it is writing, so it may continue generating more user questions and AI responses, going on until it runs out of things to say, or even summarizing and commenting on the conversation it just produced.

So we have to make it stop.

The API completion call can be given a stop sequence like “\nuser:”. If the AI writes that itself, those characters are recognized and the generation of output is terminated.
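Shown as a request payload (the same fields you would pass to the completions endpoint with the openai Python library; the model name here is just an example):

```python
# A completions request that halts generation the moment the model
# starts writing the next "user:" turn on its own.
request = {
    "model": "gpt-3.5-turbo-instruct",   # example completion model
    "prompt": "user: What is 2+2?\nAI:",
    "max_tokens": 200,
    "stop": ["\nuser:"],                 # server cuts the output here
}
print(request["stop"])
```

The stop text itself is not returned to you; the server simply ends the output just before it.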

Stop sequences can also be used to train the AI to “wrap up” answers at just the length typical of an answer. If the context is not chat but Q&A or data processing, we could fine-tune the AI to produce a distinctive stop sequence not otherwise seen after a finished answer, like “[####]”; recognizing that, the API can likewise stop the completion so the AI doesn’t keep writing aimlessly.

Chat completions - containers for conversations

OpenAI had an idea: instead of letting users insert their own “AI:” text to fool the model with fake messages seemingly written by the AI, or messages that would elevate their status, the messages sent to the model would be wrapped in special tokens that cannot be represented as normal text (or such strings could be screened out).

The AI is trained, with OpenAI’s own fine-tuning, that when it is done it should not produce a stop sequence like “user:” or “###”, but rather output a special stop token that the API developer doesn’t even need to specify.

Broken fine-tune with functions

Why is the AI repeating its output and not emitting a stop token? Something goes wrong with the example messages when a function is used in them.

Something is wrong with the specification of functions. OpenAI doesn’t want you to know the precise internal language of function inputs and outputs either, so you have to use their mechanism in the fine-tune. The AI is being trained on the wrong stop sequence (or none at all), or is being overtrained on a stop token used only for functions. Either way, it keeps on writing.


We now know how we would have fine-tuned an older model to stop, and how OpenAI would fine-tune a new model to stop with a special token when you use their “chat completion” examples.

We can again re-introduce our own stop sequence to halt the output when OpenAI’s technique doesn’t work.

At the end of each AI response that you are training the fine-tune to produce, you can (not trusting that OpenAI is doing the job right) add your own stop sequence back, such as “######” as I suggest, or even a much longer unusual string that still encodes to a single token.

With enough training on AI responses ending with the new sequence (which requires re-introducing more regular conversations into your training set, not just function examples), we can teach the AI our new stop sequence.
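The preparation step can be sketched as a small transform over your training file; the “######” marker follows the suggestion above, and the helper name is mine:

```python
# Append a custom end-of-answer marker to every final assistant message
# in a training conversation, so the fine-tune learns to emit it when done.
END_MARK = "######"

def add_end_marker(training_line: dict) -> dict:
    """Modify one JSONL training conversation in place: every assistant
    message with text content gets END_MARK appended."""
    for msg in training_line["messages"]:
        if msg["role"] == "assistant" and msg.get("content"):
            msg["content"] = msg["content"].rstrip() + END_MARK
    return training_line

line = {"messages": [
    {"role": "system", "content": "You are HelpBot."},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there! How can I help?"},
]}
line = add_end_marker(line)
print(line["messages"][2]["content"])
```

Assistant turns that contain only a function call (no text content) are left alone, since the marker belongs at the end of written answers.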

The AI might produce that string and then still repeat, but by using the API parameter "stop": "######" for our own custom stop sequence, we can have OpenAI’s servers detect when the AI produces it and halt the output before any repeats are seen.
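At inference time, shown again as a request payload (the fine-tune id is a placeholder for your own model’s name):

```python
# Calling the fine-tuned chat model with the custom stop sequence, so
# the server halts output as soon as "######" is generated.
request = {
    "model": "ft:gpt-3.5-turbo:my-org::abc123",  # placeholder fine-tune id
    "messages": [
        {"role": "system", "content": "You are HelpBot."},
        {"role": "user", "content": "Hello"},
    ],
    "stop": "######",   # matches the marker trained into every answer
}
print(request["stop"])
```

Because the stop text is stripped from the returned output, your users never see the “######” marker itself.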


  • Add the stop character sequence to the end of all “assistant” messages, including when training on function calls;
  • Add more normal chat examples covering a broad range of topics, with that “assistant” end sequence again as the last thing written;
  • Fine-tune your model;
  • Call your fine-tuned model with the additional “stop” parameter;
  • Hopefully: no more repeats.