Does fine-tuning a fine-tuned model (e.g., GPT-3 Babbage) consume the original model, or does it create a copy? I’d like to explore fine-tuning my existing model; however, I don’t want to lose access to said original model.
Thanks!
Here’s what the docs say about this:
If you have already fine-tuned a model for your task and now have additional training data that you would like to incorporate, you can continue fine-tuning from the model. This creates a model that has learned from all of the training data without having to re-train from scratch.
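To make that concrete, here's a minimal sketch of what "continuing from the model" looks like as an API request. The model name and file ID below are placeholders, and the parameter names follow the legacy GPT-3 fine-tunes API; check the current API reference for the exact endpoint your SDK version uses.

```python
# Sketch: continue fine-tuning from an already fine-tuned model.
# The model and file IDs are placeholders, not real resources.

def continued_finetune_params(finetuned_model: str, training_file: str) -> dict:
    """Build the request body for a fine-tune that starts from an
    existing fine-tuned model instead of a base model. The original
    model is referenced by name only; it is not modified or consumed."""
    return {
        "model": finetuned_model,       # full name of the existing fine-tune
        "training_file": training_file, # ID of the newly uploaded JSONL file
    }

params = continued_finetune_params(
    "babbage:ft-your-org-2023-01-01-00-00-00",  # placeholder model name
    "file-abc123",                               # placeholder file ID
)
print(params["model"])
```

With the legacy Python SDK this dict would be passed to the fine-tune creation call; the job returns a *new* model name, and the model you started from stays listed and usable.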
I tested it on my end as well, and in my experiment it created a new fine-tuned model, which is of course a fine-tuned version of the previous fine-tuned model.
Here’s a screenshot of the model list. I fine-tuned the model marked in red and got a new fine-tune (marked in green).
What I’ve observed is that the result prioritizes the data used in the second iteration of fine-tuning, and sometimes acts as if it has forgotten the data from the first iteration. In other words, fine-tuning the base model on the old and new data merged together produced different (better) results for me than fine-tuning the already fine-tuned model on the new data alone.
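If you go the merge route, it's just concatenating the two JSONL training files before uploading. A small sketch (file names and example rows are hypothetical):

```python
import json

# Sketch: merge the original and new training data into one JSONL file,
# then fine-tune the BASE model on the combined set. The example rows
# and output file name are hypothetical.
old_examples = [{"prompt": "Q1 ->", "completion": " A1"}]
new_examples = [{"prompt": "Q2 ->", "completion": " A2"}]

merged = old_examples + new_examples
with open("merged_train.jsonl", "w") as f:
    for ex in merged:
        f.write(json.dumps(ex) + "\n")

print(len(merged))
```

You'd then upload `merged_train.jsonl` and start a fresh fine-tune from the base model, rather than stacking a second fine-tune on top of the first.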