Question about using packing when fine-tuning


I am working on a text generation project with short prompts and longish completions, and I am a little confused about whether I should use packing for training.

The documentation for fine-tuning states:

Note that if you’re fine-tuning a model for a classification task, you should also set the parameter --no_packing

That seems clear enough. My use case is not a classification task. But when I run the fine_tunes.prepare_data command, it succeeds and gives me this advice in the response:

You can use your file for fine-tuning:
openai api fine_tunes.create -t "/content/songs.jsonl" --no_packing

Note that my dataset has 400 prompt-completion pairs.
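For context, each of those pairs is one JSON object per line in songs.jsonl, using the standard prompt/completion keys (the song content here is made up):

```
{"prompt": "Write a song about rain", "completion": " Rain falls softly on the roof..."}
```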

Should I train with --use_packing or --no_packing for my use case? Or does it not matter much?

One of the benefits of packing is that it speeds up fine-tuning, but if you're in no particular hurry, use packing=false; it probably won't make much of a difference either way.

I do not know much about what goes on behind the scenes during training, but in the Playground I do know completions are “better” when there’s a clear separation between examples. In this case, \n###\n works very well. Perhaps a packed training file with a stop sequence of \n###\n would have the same effect, but in my mind, packing=false helps separate examples from one another, which is why it’s good for classification tasks.
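To make the separator idea concrete, here is a minimal sketch of how you might build a JSONL training file where every completion ends with the \n###\n stop sequence. The filename and example pairs are made up, and the exact formatting (leading space, separator choice) is just one common convention, not the only valid one:

```python
import json

# Hypothetical prompt-completion pairs (made-up examples).
pairs = [
    {"prompt": "Write a song about rain", "completion": "Rain falls softly on the roof..."},
    {"prompt": "Write a song about the sea", "completion": "Waves roll in beneath the moon..."},
]

SEPARATOR = "\n###\n"  # stop sequence marking the end of each completion


def to_jsonl(pairs, separator=SEPARATOR):
    """Return JSONL text with the separator appended to every completion."""
    lines = []
    for pair in pairs:
        record = {
            "prompt": pair["prompt"],
            # A leading space on the completion is a common convention;
            # the separator cleanly ends the example.
            "completion": " " + pair["completion"].strip() + separator,
        }
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"


with open("songs.jsonl", "w") as f:
    f.write(to_jsonl(pairs))
```

At inference time you would then pass `\n###\n` as the stop sequence so generation halts at the end of one example.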

I’m also no expert on the data preparation tool, but perhaps --no_packing was just a default output and not meant as advice for your specific training set.

In short, my humble opinion is to train with packing set to false (or --no_packing): it won’t do any harm, may make training take slightly longer, and may give a slightly better result than packing=true.


Hi Carla, OK, great, thanks!
