System prompt on finetuning

I see examples of finetuning GPT-3.5 that use a system prompt. I have also seen comments that a generic system prompt would not be necessary, because the model would learn from the finetuning data.

Has anyone tested, even at a small scale, whether the system prompt should be included, or whether it is simply extra cost?

Think of something like this: “You are an accurate translator. When the user says something, return exactly the same text in English.” Should I repeat this in my finetuning dataset to give it an initial hint about what is expected, or should I skip it?

As I understand it, if I use a system prompt in the training data, I also need to use it when requesting completions. That may lead to a slight increase in costs, but perhaps only a few percent. I may also give a much longer prompt, with instructions on the exact localization I prefer, formatting, etc., so it could be in the ~10% extra cost range.
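As a rough sketch of the cost question above (the token counts are made-up assumptions for illustration, not measurements of any real prompt), the relative overhead of sending a system prompt with every request can be estimated like this:

```python
# Back-of-the-envelope estimate of the extra cost of a repeated system prompt.
# All token counts below are illustrative assumptions, not measured values,
# and input/output tokens are priced the same here for simplicity.

def overhead_pct(system_tokens: int, user_tokens: int, completion_tokens: int) -> float:
    """System-prompt tokens as a percentage of total tokens per request."""
    base = user_tokens + completion_tokens
    return 100 * system_tokens / (base + system_tokens)

# A short one-line system prompt (~25 tokens) on a typical translation request:
short_pct = overhead_pct(system_tokens=25, user_tokens=150, completion_tokens=150)

# A long prompt with localization and formatting rules (~250 tokens):
long_pct = overhead_pct(system_tokens=250, user_tokens=150, completion_tokens=150)

print(f"short prompt: {short_pct:.1f}% overhead, long prompt: {long_pct:.1f}% overhead")
```

With these assumed numbers the short prompt adds a single-digit percentage, while a very long instruction block can dominate the request, so the real overhead depends heavily on how long your typical user messages and completions are.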


What do you mean when you say training data? Custom GPTs can’t be trained, as far as I know. I’d be very glad to be wrong. How do you train your GPT?

I mean finetuning of gpt-3.5-turbo: https://platform.openai.com/docs/guides/fine-tuning I know I used a slightly wrong term. It is not exactly “training a model” but finetuning the output for format and style based on my examples.

Based on some experimentation, I think the answer is yes: we should repeat the system prompt in the dataset. That means we must also use it when querying the finetuned model, although we can slightly change it for slightly different purposes.
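For reference, here is a minimal sketch of one line of the finetuning file with the system prompt repeated, using the JSONL chat format from the guide linked above (the helper function name and the example texts are my own):

```python
import json

# The system prompt to repeat in every training example; this is just the
# translator prompt quoted earlier in the thread.
SYSTEM_PROMPT = (
    "You are an accurate translator. When the user says something, "
    "return exactly the same text in English."
)

def training_line(source_text: str, translation: str) -> str:
    """Serialize one training example as a JSONL line for the finetuning file."""
    example = {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": source_text},
            {"role": "assistant", "content": translation},
        ]
    }
    return json.dumps(example, ensure_ascii=False)

# One line of the .jsonl file; the full file is many such lines.
print(training_line("Hyvää huomenta", "Good morning"))
```

At inference time you would then send the same (or a slightly adjusted) system message ahead of the user message, so the request shape matches what the model saw during finetuning.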

The reason is that the system prompt helps the model generate an almost correct answer, so the finetuning has less work to do to adjust the response.

If the training dataset is big, then including the system prompt is probably unnecessary.
