"validation_file" in Create Fine-Tuning Job

In the create Fine-Tuning job API there is one parameter validation_file (POST https://api.openai.com/v1/fine_tuning/jobs) in this doc display the the below guidance for that

But don’t understand this, please let me know what the difference is between the training_file and validation_file, and also clarify what data should be included in the validation_file and in what format, such as the same format as the training_file (JSONL), and whether I should upload this file in the same way as I did with the training file for fine-tuning.

TLDR: Training file is a file which contains the data using which the fine-tuning job is done and output model is created. The validation file consists of data, yet unseen by the model, on which the performance is evaluated to fairly judge the performance of the model on data it has not encountered before.

Could you please tell me what is the effect if I not pass the validation_file.

Does the validation file improve the created fine-tuned model performance?

Even if you do not pass a validation file, it will effect the model that is generated out. However, you will have to do na extensive manual evaluation, without any bias.

If you pass one, the performance metrics will be returned alongside the final model, so you get a better idea of how the performance was

Does the validation file improve my fine-tuning model performance?

The validation file doesn’t affect the tuning that is done from the training file.

The fine-tune job just produces a report every batch that is performed during the fine-tune learning, so you can see how the trained model also scored on other questions that are like how you want your AI to respond. Or if you want to continue with more tune because the model hadn’t reached the peak performance on similar questions yet.

1 Like

Thanks, it’s helped me a lot.

Please tell me how to update the fine-tuned model training data once created fine-tune model.

When you fine tune normally, you choose the base model that you want to train.

When you want to continue, you specify your model you already trained.

Each will produce a new separate model name.

2 Likes

I tried this way but did not get the god result as a separate model let me explain

I created the prompt for generating the color palette based on the user query
So Consider the color palette for Independence Day. based on this create the 10 training examples and create the fine-tuned model. This model works perfectly on Independence Day but other category color palettes do not work like a festival. so we created the 10 training data for the festival and updated our model.

But in this model not give the proper color about the Independence Day that working fine In previous model.