"System" message after fine-tuning

Hey,

Quick question about fine-tuning GPT. Once I fine-tune a model with a specific persona and writing templates, do I still need to send the long “system” message (with instructions about the persona) in every API call?

Is there any way to skip sending the system message after fine-tuning, or is it always required?

Thanks!

It’s not always required, but it’ll definitely boost your performance on the task if you send the system prompt with the messages as well.

It lets the model unequivocally know that that is the case to follow, so the responses will definitely more in line with the finetuning done

Welcome to the Forum!

I’ll put it like this: When you use your fine-tuned model, you still need to include the instructions you used during your training. Leaving that out entirely would result in nonsense output.

You might however be able to play around a bit and see if there are certain details you may be able to leave out without impacting the output. But the core instructions must still be included whether they form part of a system message or part of the user message - the model still needs to be told what it is supposed to do.

For comprehensiveness I’ll add an example where I tested this specifically:
One of my fine-tuned models is for a classification task. As part of the training data I included a system message that instructs the model to classify a text into one of several pre-defined categories and then lists the categories to choose from. In the user message I include the text subject to classification.

If I leave out the system message during the consumption of the model, it will not return a classification. Instead it will just provide a random response, commenting on the text provided. However, if I include the original system message, then the model performs as intended.

This might be an extreme case but it reinforces the importance of keeping the core instructions.

3 Likes

Welcome @clmsvie

One of the goals of fine-tuning is to save on instruction tokens, thereby setting the model’s behavior through training data by showing the model examples of how to respond to a specific structured context.

Hence, if you’re fine-tuning the model to do one special task, you may skip the long system message or use a shorter one.

Finally, you’ll have to supply whatever structure you fine-tune the model with, in order to best utilize the model.

1 Like