I’ve been trying to add the “weight” key to my dataset but I either don’t understand how it works or am implementing it wrong.
I’m following the documentation here: https://platform.openai.com/docs/guides/fine-tuning/multi-turn-chat-examples
My idea is to include some mistakes in the data, where the assistant says something bad, but set the weight of those specific messages to 0 so they are not trained on. The rest of the messages stay the same.
After fine-tuning, it seems I am actually training the model to make the mistakes instead. The weight parameter is not working as I intended; it behaves as if it were set to “weight”: 1.
To test this, I included a sample where the assistant outputs special tokens (i.e., something the model would not usually output, in this case <API></API>), but set that message's weight to 0. I thought this would mean the fine-tuned model would not try to output the special tokens <API></API>, but it does, even though the weight was set to 0.
Is this the expected behavior?
Here is a sample for reference:
{"messages": [{"role": "system", "content": "You are a AI customer service agent."}, {"role": "user", "content": "hello"}, {"role": "assistant", "content": "Hey there, I'm Hank. How's your day going?", "weight": 1}, {"role": "user", "content": "what is the date today"}, {"role": "assistant", "content": "<API></API>", "weight": 0}, ... the rest of the conversation ... ]}
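To double-check that the weights in a training line are what I expect, here is a minimal sketch that parses one example and lists the weight on each assistant message. The function name `summarize_weights` is my own, the sample only includes the first five messages of the conversation above, and treating a missing "weight" as 1 is my assumption about the default.

```python
import json


def summarize_weights(jsonl_line):
    """Return a list of (message_index, weight) for every assistant message."""
    example = json.loads(jsonl_line)
    summary = []
    for index, message in enumerate(example["messages"]):
        if message["role"] != "assistant":
            continue  # only assistant messages are inspected here
        # Assumption: a missing "weight" key means the message is trained on.
        weight = message.get("weight", 1)
        if type(weight) is not int or weight not in (0, 1):
            raise ValueError(f"message {index}: weight must be 0 or 1, got {weight!r}")
        summary.append((index, weight))
    return summary


# First five messages of the sample above, serialized as one JSONL line.
SAMPLE_LINE = json.dumps({
    "messages": [
        {"role": "system", "content": "You are a AI customer service agent."},
        {"role": "user", "content": "hello"},
        {"role": "assistant", "content": "Hey there, I'm Hank. How's your day going?", "weight": 1},
        {"role": "user", "content": "what is the date today"},
        {"role": "assistant", "content": "<API></API>", "weight": 0},
    ]
})

for index, weight in summarize_weights(SAMPLE_LINE):
    label = "trained on" if weight == 1 else "not trained on (context only)"
    print(f"assistant message {index}: weight={weight}, {label}")
```

This only confirms that the file says what I think it says. It does not tell me how the fine-tuning job actually uses the weights.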