GPT-3.5 Fine-Tuning System Message

I’m about to dive into fine-tuning (FT) gpt-3.5-turbo but am still uncertain about how to handle the system message once the FT model is in production.

Most examples I see for FT use the same system message in every example in the dataset… Does this mean that once the model is fine-tuned, that portion of the system message is no longer needed because it’s essentially baked in? On the flip side, if it is still needed, can you append more directions to the system message (ones that weren’t necessarily the focus of the FT job) and still reap the enhancements from the FT’d model?

For example, if you fine-tune the model to be sarcastic (described in the system messages and demonstrated in the examples), can you then give the FT’d model a totally different system message and it will still be sarcastic?
Alternatively, can you give the FT’d model a system message of “be sarcastic and concise” and still benefit from the FT dataset?

Otherwise, it would suggest that you must always use in production the exact same system message that was used in the training examples.
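To make the question concrete, here is the dataset shape I mean: a hypothetical JSONL file (filename and Q&A pairs invented for illustration) where every training example repeats the same system message, as in the sarcastic-“Marv” pattern from OpenAI’s docs:

```python
import json

# Hypothetical fine-tuning dataset: every example carries the same system message.
SYSTEM = "Marv is a factual chatbot that is also sarcastic."

examples = [
    ("What's the capital of France?",
     "Paris. As if everyone doesn't know that already."),
    ("How far away is the Moon?",
     "Around 384,400 km. Give or take a few, like that matters."),
]

# One JSON object per line, in the chat fine-tuning format.
lines = [
    json.dumps({
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    })
    for user, assistant in examples
]

with open("sarcastic.jsonl", "w") as f:
    f.write("\n".join(lines) + "\n")
```

The question is whether that repeated `SYSTEM` string still has to be sent at inference time after the fine-tune.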

Just want to make sure I really understand this before undertaking the time investment.

It is good to ponder the results of a fine-tune before embarking on fine-tuning.

Fine-tuning a chat model - somewhat unique to OpenAI

gpt-3.5-turbo tuning is a new frontier, with no documentation or advice on how to do it beyond “here’s three sarcastic AI comments” (plus a system prompt that is already all you’d need to create that behavior).

Fine-tuning is, at a base level, training on token sequences to increase the probability of certain patterns and paths. This is backed by far more intelligence than what you see or impart, allowing inference on untrained cases.

Chat models are unique in that the messages are contained in roles and special tokens, and the model is already highly tuned. Whereas a base model might easily simulate the next part of a conversation between Ashley and Brittany, here a rigid chat framework is applied on top of what you already get, plus your own containerized training, which implies: “here are user inputs, write your outputs, and then you’re done.”

What to do, then?

So to get right down to it: the system message is best used for an identity. You tune with a system prompt and then use that same system prompt in production. That way, instead of your fine-tune being overlaid on a ChatGPT chatbot, you immediately depart into your own operation.

(We don’t know what “default, basic” system message the AI is tuned on, but you can bet that, if not sanitized, there’s a lot of “You are ChatGPT, a large language model”, and then a whole bunch of OpenAI’s fine-tuning that makes it act as it does. Competing directly against that weight will be hard.)

Are you training on a long system prompt? No: ideally a reduced one, where a short message will divert the AI to a different training destination. What you are really training on is the example responses. An AI that talks like a pirate can be fine-tuned by example, with no mention that “Roger the AI” likes to talk in pirate-speak.
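A sketch of that idea, with invented pirate examples: the system message is just a short identity, and it never describes the style. The behavior is carried entirely by the demonstrated assistant turns:

```python
import json

# Short identity only - the pirate style is demonstrated, never described.
SYSTEM = "You are Roger."

examples = [
    ("What's the weather like today?",
     "Arr, the skies be clear and the wind be fair, matey!"),
    ("Can you help me write an email?",
     "Aye, hand me yer quill and we'll draft it proper, we will!"),
]

# Same chat fine-tuning JSONL format, one example per line.
with open("roger.jsonl", "w") as f:
    for user, assistant in examples:
        f.write(json.dumps({
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": user},
                {"role": "assistant", "content": assistant},
            ]
        }) + "\n")
```

Nothing in the system message says “pirate” - yet after tuning, “You are Roger.” is the cheap trigger that lands the model in that trained destination.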

That’s mostly the point of fine-tuning: being able to do things that no amount of prompting could really arrive at, by tuning new AI behavior so it doesn’t need prompting. gpt-3.5-turbo no longer needs a prompt to deny you bong rip tips for toddlers.

There’s also logic and thinking behind a lot of AI applications, where no amount of examples lets the AI see what the black box is doing, and a prompt will still be beneficial or even the best solution. So train with that prompt and use it in production too.

So to summarize: demonstrate the behavior, don’t describe the behavior. Save yourself the tokens at inference time. Your system prompt and user input should elicit the output style you’ve tuned on. “Marv” will know he is sarcastic without OpenAI’s extended example prompt.
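In production that just means sending the same short system message you trained on. A minimal sketch (the `ft:` model id is a placeholder for whatever id your fine-tune job produces; the helper name is mine) that builds the request payload you’d pass to a chat-completions call:

```python
# Reuse the short, trained-on system message at inference - no style description.
FT_MODEL = "ft:gpt-3.5-turbo:my-org::abc123"  # placeholder fine-tuned model id
SYSTEM = "You are Roger."  # same short identity used in the training file

def build_request(user_input: str) -> dict:
    # Keyword arguments for a chat-completions call, e.g.
    # client.chat.completions.create(**build_request("ahoy"))
    return {
        "model": FT_MODEL,
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": user_input},
        ],
    }

req = build_request("Where should I sail next?")
```

The system message costs a handful of tokens per call instead of a paragraph of style instructions, and it matches the training examples exactly, which is the whole trick.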
