I’ve tried to fine-tune GPT-3.5 1106 and 0125 on Azure with 30-40 example rows covering 4 different scenarios.
The result I got is a model that very often repeats its responses; for example, I get this output:
Perfect! Can I ask you if you would like to order for home delivery, takeaway or book a table? Perfect! Can I ask you if you would like to order for home delivery, takeaway or book a table?
Instead of simply giving it back once.
I know that the number of examples is really low, and I’m already planning to remove similar names and enhance the dataset, but I was not expecting this behavior to come out of a fine-tune.
Ok, interesting. You already included a higher frequency penalty. You might also want to try adding a presence penalty, perhaps starting with a value of 0.5.
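For reference, a minimal sketch of how both penalties could sit alongside the usual chat completion parameters (the deployment name and penalty values here are placeholders, not something from this thread):

```python
# Sketch: request parameters for a fine-tuned chat model.
# "my-finetuned-deployment" is a placeholder deployment/model name.
request = {
    "model": "my-finetuned-deployment",
    "messages": [
        {"role": "system", "content": "You are a restaurant assistant."},
        {"role": "user", "content": "Hi, I'd like to order."},
    ],
    "frequency_penalty": 0.7,  # penalize tokens proportionally to how often they appeared
    "presence_penalty": 0.5,   # flat penalty for any token that has already appeared
}
```

The difference between the two: frequency penalty grows with each repetition of a token, while presence penalty applies once as soon as a token has appeared at all.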
This has to do with stop sequences. A stop sequence is a token or set of tokens that you may have appended to the end of each assistant sample in your training data.
Stop sequences are used during fine-tuning by appending them to the end of each expected assistant response.
When you consume the fine-tuned model, the same stop sequence is passed via the stop parameter.
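As an illustration (the `@@@` sentinel here is a hypothetical choice, not something prescribed by the API), one JSONL training sample and the matching inference parameter might look like:

```python
import json

STOP = "@@@"  # hypothetical sentinel; any string unlikely to occur in real output works

# One line of the fine-tuning JSONL, with the sentinel appended to the assistant turn.
sample = {
    "messages": [
        {"role": "user", "content": "Can I book a table?"},
        {"role": "assistant", "content": "Of course! For how many people?" + STOP},
    ]
}
line = json.dumps(sample)

# At inference time, the same sentinel is passed via the stop parameter,
# so generation halts as soon as the model emits it.
inference_params = {"stop": [STOP]}
```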
You can also derive a stop sequence if all your assistant messages already end with a common token; ideally, this token wouldn’t appear anywhere in the generated text except at the very end.
That shouldn’t be required when tuning the chat completions model, though. The stop token is already built into the format and should be trained, and trained with lots of repetitions of it.
However, especially when doing a fine-tune that includes functions as examples, it seems damaged. Just one more thing that OpenAI has left unaddressed. It is especially broken in that the AI doesn’t continue into producing a “user” token; it repeats its own output.
The normal stop tokens are 100265 and 100260, depending on what the AI is emitting.
You’ll need to include a bunch of normal conversation outside of functions, and then end with your own stop token sequence like @!@!@!@!@ in training. Then you can stop at @!@ with the API call.
Yes, I’ve seen that everything around function-call tuning seems a little messed up, starting with the naming convention (tool calls use tool_call_id while fine-tuning uses function_name).
That would be the idea. I use @! because those characters can’t be joined with each other into one token. However, upon reflection, the AI might get confused about which one to produce first if trained on a long run of them, so maybe just one instance of the sequence. You can also use something like ########, uniformly and always the same length, which is in fact a single token.
I’ll try to add ######## at the end of each assistant message and retrain. Then I’ll also need to pass ######## as the stop sequence at inference time, right?
Yes, and you shouldn’t need to exactly match the token as a stop sequence, so simply #### will catch many alternate tokens if the AI tries to write it differently or shorter. This does prohibit extended runs of pound signs in the AI output, but the only place that is likely is when repeating back code.
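Putting the two steps together, a small sketch of the retraining prep plus the inference-side stop (file reading/writing omitted; the helper operates on JSONL text, and the chat format shown is an assumption about your dataset layout):

```python
import json

SENTINEL = "########"  # a single uniform token, per the suggestion above

def append_sentinel(jsonl_text: str) -> str:
    """Append the sentinel to every assistant message in chat-format JSONL text."""
    out_lines = []
    for line in jsonl_text.splitlines():
        sample = json.loads(line)
        for msg in sample["messages"]:
            if msg["role"] == "assistant":
                msg["content"] = msg["content"] + SENTINEL
        out_lines.append(json.dumps(sample))
    return "\n".join(out_lines)

# At inference, "####" suffices as the stop sequence: it also catches shorter
# or differently tokenized runs of pound signs the model might emit.
inference_params = {"stop": ["####"]}
```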
Thank you so much, I’ll report back on how the process goes.
UPDATE:
It seems to work quite a bit better. There is still some repetition, but I think that is because of the small dataset the model is fine-tuned on. I’ll try to enhance it over the next few days.
Thanks for the support