I know there can’t be a one-size-fits-all answer to this, but I am looking for the general principle.
Here’s some information about my project:
I am creating a fine-tuned model for a specific scenario. The “system” message is the same across all the training examples, and for every “assistant” message I have 3 different ways the “user” can ask for it, so this 3x’d my fine-tuning dataset.
As it stands, I have ~2500 examples in my fine-tuning dataset. I plan to use the gpt-3.5-turbo-1106 model initially, but I may try gpt-4-1106-preview later with the same dataset.
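For context, this is roughly how the dataset gets built (a minimal sketch; the system message, phrasings, and replies here are made-up placeholders, not my actual data):

```python
import json

# Same "system" message for every example (placeholder text).
SYSTEM = "You are a helpful assistant for my specific scenario."

# Each target assistant reply is paired with 3 user phrasings,
# so N targets expand into 3*N training examples.
targets = [
    ("What's the refund policy?",
     "How do I get a refund?",
     "Can I return this?",
     "Refunds are available within 30 days of purchase."),
]

examples = []
for *phrasings, assistant_reply in targets:
    for user_msg in phrasings:
        examples.append({
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_msg},
                {"role": "assistant", "content": assistant_reply},
            ]
        })

# Fine-tuning expects JSONL: one chat example per line.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```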
Typically 5%-20% of your dataset is held out for evaluation. That said, there is an argument for keeping only a handful of examples aside for manual human validation and folding the rest of the would-be eval set back into training, since the extra training data can give you a better run…
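As a concrete sketch of that hold-out split (file names are placeholders, and the toy data stands in for your ~2500-example JSONL file):

```python
import json
import random

# Toy stand-in for the dataset; in practice you'd read your existing JSONL file.
examples = [
    {"messages": [{"role": "user", "content": f"example {i}"}]}
    for i in range(2500)
]

random.seed(0)       # reproducible split
random.shuffle(examples)

val_fraction = 0.10  # anywhere in the 5%-20% range works
n_val = int(len(examples) * val_fraction)
val, train = examples[:n_val], examples[n_val:]

# One JSON object per line, as the fine-tuning API expects.
with open("train_split.jsonl", "w") as f:
    f.writelines(json.dumps(ex) + "\n" for ex in train)
with open("val_split.jsonl", "w") as f:
    f.writelines(json.dumps(ex) + "\n" for ex in val)
```

One caveat for your setup: since each assistant reply appears under 3 user phrasings, a naive shuffle like this can leak near-duplicates of a training example into validation. Splitting at the level of targets, before expanding into the 3 phrasings, avoids that.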
Thanks for the response. Is there any documentation on how the validation set is used with the fine-tuning API?
My experience with it is limited to simple classification tasks where there is only one correct answer. In this case, though, the generated outputs might still be good even if they don’t match the data in my validation set exactly.
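One workaround I’m considering for that: score similarity against the reference instead of requiring an exact match. A minimal sketch using the standard-library difflib (the example strings are made up):

```python
from difflib import SequenceMatcher

def similarity(generated: str, reference: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, generated, reference).ratio()

# An output can be "good" without matching the reference verbatim:
ref = "Refunds are available within 30 days of purchase."
gen = "You can get a refund within 30 days of buying."
# exact match fails here, yet the two answers overlap heavily
```

This is crude compared to human review or model-graded evals, but it at least separates “different wording” from “wrong answer”.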
I would start with the OpenAI guide here, and then perhaps look at YouTube for guides on fine-tuning and evaluation.