I am trying to construct a fine-tuning dataset for GPT-3.
The example structure for messages given in the documentation is:
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
What exact content needs to go in the "system" entry? The example of "Marv is a factual chatbot that is also sarcastic." does not provide any useful guidance.
Is this supposed to contain my entire set of instructions for the assistant, just a summary of those instructions, or something else? It is unclear what content is needed here and how it is used during fine-tuning.
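For context, this is roughly how I am assembling the file right now (a minimal sketch in Python; the file name and example pairs are just placeholders):

```python
# Minimal sketch: writing a fine-tuning file in the chat format,
# reusing one short system message for every example.
# "marv_finetune.jsonl" and the example pairs are placeholders.
import json

SYSTEM_MESSAGE = "Marv is a factual chatbot that is also sarcastic."

# Hypothetical input/output pairs the model should learn from.
pairs = [
    ("What's the capital of France?",
     "Paris, as if everyone doesn't know that already."),
    ("Who wrote 'Romeo and Juliet'?",
     "Oh, just some guy named William Shakespeare. Ever heard of him?"),
]

with open("marv_finetune.jsonl", "w", encoding="utf-8") as f:
    for user_msg, assistant_msg in pairs:
        example = {
            "messages": [
                {"role": "system", "content": SYSTEM_MESSAGE},
                {"role": "user", "content": user_msg},
                {"role": "assistant", "content": assistant_msg},
            ]
        }
        f.write(json.dumps(example) + "\n")
```

I just don't know what the system message in each record should actually say.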
As I understand it, the system message is meant to set the behavior the assistant should exhibit.
There's no need for full instructions. Put simply, one of the reasons for fine-tuning is to save on instruction (prompt) tokens and go straight from input data to the desired output.
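To illustrate the token saving: once the model is fine-tuned, a request can carry the same short system message used in training instead of a long instruction block. A rough sketch with the openai Python package (the fine-tuned model name is a placeholder):

```python
# Sketch only: calling a fine-tuned model with the same short system
# message used in training, rather than a long instruction prompt.
# The model name below is a placeholder for your own fine-tuned model ID.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo:my-org::abc123",  # hypothetical fine-tuned model
    messages=[
        {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."},
        {"role": "user", "content": "What's the capital of France?"},
    ],
)
print(response.choices[0].message.content)
```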
Hey @AltDev, regarding your query: first, test the prompt thoroughly and iterate on it. If that doesn't work, find the closest prompt that works with few-shot prompting, and then ask the model itself to create a prompt for you, which is called meta prompting.
Once you have the closest prompt that actually works, use it as the system message, and keep it consistent across all the examples.
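As a sanity check, something like this rough sketch (file name and prompt text are placeholders) can verify that every example in the training file uses the chosen system prompt:

```python
# Sketch: quick consistency check that every training example starts with
# the same system message, since it should match across the whole file.
# "train.jsonl" and the prompt text are placeholders.
import json

CHOSEN_SYSTEM_PROMPT = "Marv is a factual chatbot that is also sarcastic."

with open("train.jsonl", encoding="utf-8") as f:
    for i, line in enumerate(f, start=1):
        messages = json.loads(line)["messages"]
        first = messages[0]
        if first["role"] != "system" or first["content"] != CHOSEN_SYSTEM_PROMPT:
            print(f"line {i}: system message differs from the chosen prompt")
```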
If you could tell me more about what you are doing, I could guide you better.