Does weight at 0 reduce training tokens?

Does anyone know if using “weight”: 0 in the training file for fine-tuning reduces the number of training tokens? I imagine it does, since the model doesn’t train on that message.

Can someone confirm?

For reference, there is a new parameter that can be passed on a particular message within an example conversation in a chat-completions training file:

To skip fine-tuning on specific assistant messages, a weight key can be added to disable fine-tuning on that message, allowing you to control which assistant messages are learned. The allowed values for weight are currently 0 or 1.
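For context, here is what a training line using that key can look like. This is just an illustrative sketch: the conversation content and the `train.jsonl` file name are made up, and only the `messages`/`weight` structure reflects the documented format.

```python
import json

# One training example in the chat-completions fine-tuning format.
# "weight" goes on assistant messages, with allowed values 0 or 1.
example = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
        # Kept in the conversation, but per the docs weight 0 means
        # this assistant message is not fine-tuned on.
        {"role": "assistant", "content": "I think it might be Lyon.", "weight": 0},
        {"role": "user", "content": "No, try again."},
        # weight 1 (or omitting the key) means the message is learned.
        {"role": "assistant", "content": "The capital of France is Paris.", "weight": 1},
    ]
}

# Each training example is one line of the JSONL file.
with open("train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```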

You are right that the method employed here is unclear, as is why a true “skip” would have any value beyond just removing the message yourself, if the AI model were never passed those tokens for training. The only thing a 0-weight might do is break up the training so that messages are learned more individually, instead of as a sequence of runs that starts with or continues into another message - which would still require the presence of tokens that have no impact.

I suspect that such a mechanism could have future or internal (special partner) use as a float value, effectively passing a per-message learning rate. It would not remove the tokens or reduce the token count, only scale the algorithmic impression those tokens make on learning - which is what “weight” implies by definition.
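OpenAI hasn’t documented the internals, but a per-token loss mask of the kind described above would look roughly like this. Purely a sketch with made-up tensors, not a claim about their implementation:

```python
import torch
import torch.nn.functional as F

def masked_lm_loss(logits, targets, token_weights):
    """Cross-entropy over a token sequence, scaled per token by token_weights.

    logits:        (seq_len, vocab_size) model outputs
    targets:       (seq_len,) next-token labels
    token_weights: (seq_len,) 0.0 for masked tokens, 1.0 otherwise
                   (or any float, in the hypothetical generalisation)
    """
    per_token = F.cross_entropy(logits, targets, reduction="none")
    weighted = per_token * token_weights
    # Normalise by the weight mass so masked tokens don't dilute the loss.
    return weighted.sum() / token_weights.sum().clamp(min=1.0)

# Toy example: 6 tokens, vocabulary of 10. The middle two tokens belong to a
# weight-0 assistant message, so they stay in the sequence (and in the context
# the model sees) but contribute nothing to the gradient.
logits = torch.randn(6, 10, requires_grad=True)
targets = torch.randint(0, 10, (6,))
weights = torch.tensor([1.0, 1.0, 0.0, 0.0, 1.0, 1.0])
loss = masked_lm_loss(logits, targets, weights)
loss.backward()
```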


Thanks for your answer, but what I was testing is whether the number of training tokens reported once the fine-tuning of the model has finished goes down when using “weight”: 0.
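For what it’s worth, the count I’m comparing is the one reported on the finished job. With the openai Python client that can be read like this (the job ID below is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# Retrieve a finished fine-tuning job and inspect the token count it reports.
job = client.fine_tuning.jobs.retrieve("ftjob-abc123")  # placeholder job ID
print(job.status, job.trained_tokens)
```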

And about why to use “weight”: 0: basically, if you work with RAG, where the system message changes for each user message, each model response has a different context, so you only want to train on the response that matches the context included in that example.
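A sketch of the kind of training line I mean (placeholder content and file name): the system message holds the context retrieved for the latest question, earlier assistant turns are kept only as conversation history with weight 0, and the final response, the one produced with that context, gets weight 1.

```python
import json

# RAG-style example: the system message carries the context retrieved for the
# *latest* user question, so earlier assistant replies (generated against
# different retrieved context) are kept as history but masked with weight 0.
example = {
    "messages": [
        {"role": "system", "content": "Context: <documents retrieved for question 2>"},
        {"role": "user", "content": "Question 1"},
        {"role": "assistant", "content": "Answer 1 (based on other context)", "weight": 0},
        {"role": "user", "content": "Question 2"},
        {"role": "assistant", "content": "Answer 2 (based on this context)", "weight": 1},
    ]
}

with open("rag_train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```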