Unable to get "weight" field to work

grady1 · June 19, 2024, 5:32pm

I’ve been trying to add the “weight” key to my dataset, but either I don’t understand how it works or I am implementing it incorrectly.

I’m following the documentation here: https://platform.openai.com/docs/guides/fine-tuning/multi-turn-chat-examples

My idea is to add some mistakes to the data, where the assistant says something bad, and set the weight of those messages to 0 so that they are not trained on. Everything else stays the same.

After fine-tuning, it seems I am instead training the model to make the mistakes. The weight parameter is not working as I intended; the result looks as if every message had “weight”: 1.

To test this, I included a sample in which the assistant outputs special tokens (i.e., something the model would not usually output, in this case <API></API>) and set that message’s weight to 0. I expected the fine-tuned model not to output <API></API>, but it does, even though the weight was set to 0.

Should this be the case?

Here is a sample for reference:

{"messages": [{"role": "system", "content": "You are a AI customer service agent."}, {"role": "user", "content": "hello"}, {"role": "assistant", "content": "Hey there, I'm Hank. How's your day going?", "weight": 1}, {"role": "user", "content": "what is the date today"}, {"role": "assistant", "content": "<API></API>", "weight": 0}, ... the rest of the conversation ... ]}
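A quick sanity check on the file itself can rule out formatting problems before blaming the training. Below is a minimal sketch, assuming the "weight" key is only meaningful on assistant messages and only takes the integer values 0 or 1 (my reading of the linked guide, not something confirmed in this thread). The function name `check_weights` is made up for illustration, and the sample contains only the messages shown above; the elided rest of the conversation is left out.

```python
import json


def check_weights(jsonl_line: str) -> list[str]:
    """Return human-readable problems with "weight" keys in one training line."""
    problems = []
    record = json.loads(jsonl_line)
    for i, msg in enumerate(record["messages"]):
        if "weight" not in msg:
            continue  # the key is optional
        weight = msg["weight"]
        if msg["role"] != "assistant":
            problems.append(f"message {i}: weight set on role {msg['role']!r}")
        # Assumption: only the integers 0 and 1 are valid (so "0" or 0.5 are flagged).
        if type(weight) is not int or weight not in (0, 1):
            problems.append(f"message {i}: unexpected weight {weight!r}")
    return problems


# Only the messages visible in the post; nothing is invented for the elided part.
sample = json.dumps({"messages": [
    {"role": "system", "content": "You are a AI customer service agent."},
    {"role": "user", "content": "hello"},
    {"role": "assistant", "content": "Hey there, I'm Hank. How's your day going?", "weight": 1},
    {"role": "user", "content": "what is the date today"},
    {"role": "assistant", "content": "<API></API>", "weight": 0},
]})

print(check_weights(sample))
```

If this reports nothing for every line of the file, the weights are at least well-formed, and the question becomes how they are applied during training.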

harperg · June 19, 2024, 5:41pm

Bump! Same issue. I trained with some samples at weight 0, but the model still seems to train on those tokens.

grady1 · June 19, 2024, 6:12pm

Alternatively, can anyone point me to more documentation on how the “weight” key should be used? I could only find that one snippet. Thanks.

fluxtah · June 19, 2024, 7:11pm

If the weight is zero, doesn’t it just skip that assistant response in training? I think (but don’t quote me on this) that if you have a training sample:

SYSTEM, USER, ASSISTANT, USER, ASSISTANT

it is treated as two training examples, which break down to:

SYSTEM, USER, ASSISTANT
SYSTEM, USER, ASSISTANT, USER, ASSISTANT

By writing:
SYSTEM, USER, ASSISTANT(weight=0), USER, ASSISTANT

you skip example 1 and use only example 2.

Someone please correct me if I’m wrong, but this is how I understand it.

EDIT: I asked ChatGPT about this and it seems I am wrong, so it would be nice to get clarification on this too. https://chatgpt.com/share/7fe6b71e-bf1b-4c29-bcea-42e94df54565
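To make the mental model in this post concrete, here is a small sketch of the breakdown it describes. This only illustrates the hypothesis (which the edit above already doubts); it is not a description of how the fine-tuning service actually builds examples. The helper `prefix_examples` is a made-up name, and treating a missing weight as 1 is an assumption.

```python
def prefix_examples(messages):
    """Split one conversation into (context, target) pairs, one per
    assistant message whose weight is not 0 (missing weight counts as 1)."""
    examples = []
    for i, msg in enumerate(messages):
        if msg["role"] == "assistant" and msg.get("weight", 1) != 0:
            examples.append((messages[:i], msg))
    return examples


conversation = [
    {"role": "system", "content": "S"},
    {"role": "user", "content": "U1"},
    {"role": "assistant", "content": "A1", "weight": 0},
    {"role": "user", "content": "U2"},
    {"role": "assistant", "content": "A2"},
]

examples = prefix_examples(conversation)
for context, target in examples:
    print([m["content"] for m in context], "->", target["content"])
```

Note that even under this model, the weight-0 message is never a target but still sits inside the context of the later example.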

grady1 · June 19, 2024, 7:15pm

When I was testing, I had at least one “weight”: 0 somewhere in every one of my samples, but the model still seemed to learn to output the <API></API> tokens. So it seems it is being trained on the full samples to some extent, even when a zero weight is present.
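One thing worth ruling out in a test like this is the marker also appearing in an assistant message that is not zero-weighted, for example one where the "weight" key was forgotten. A rough audit sketch follows; `find_trained_marker` is a hypothetical helper, and it assumes a missing weight behaves like 1.

```python
import json


def find_trained_marker(jsonl_lines, marker):
    """Return (line_number, message_index) for every assistant message that
    contains `marker` and is not explicitly given weight 0."""
    hits = []
    for line_no, line in enumerate(jsonl_lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        for idx, msg in enumerate(json.loads(line)["messages"]):
            if msg["role"] != "assistant":
                continue
            content = msg.get("content") or ""
            # Assumption: an absent "weight" key is treated like weight 1.
            if marker in content and msg.get("weight", 1) != 0:
                hits.append((line_no, idx))
    return hits


# Tiny in-memory stand-in for a training file.
lines = [
    json.dumps({"messages": [
        {"role": "user", "content": "what is the date today"},
        {"role": "assistant", "content": "<API></API>", "weight": 0},
    ]}),
    json.dumps({"messages": [
        {"role": "user", "content": "what time is it"},
        {"role": "assistant", "content": "<API></API>"},
    ]}),
]

hits = find_trained_marker(lines, "<API></API>")
print(hits)
```

For a real dataset, the same function can be passed an open file handle, since iterating over a file yields its lines. An empty result would mean every occurrence of the marker is zero-weighted, which would support the observation above.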

fluxtah · June 20, 2024, 8:31am

Yeah, true. I think the whole conversation is still taken as context.
