Request: continuous finetuning?

asabet · January 9, 2022, 1:45am

It would be helpful if finetuned models could be finetuned further with more data, i.e. something like

openai api fine_tunes.create -t <data>.jsonl -m <finetuned-model>

Right now, the API only allows passing base models to the finetune endpoint. In cases where a training set is growing over time, it would be redundant to retrain a base model on the entire dataset from scratch, instead of continuing to finetune the model on just the new slice of data. Retraining can get costly and time-consuming (especially for curie and davinci), and create additional load on OpenAI’s servers unnecessarily.

Implementation-wise, it would only require cloning the weights of the finetuned model into the trainer, instead of using the base model’s parameters. I think this feature would be greatly beneficial for API users. Is it possible for OpenAI to implement it?

daveshapautomator · January 9, 2022, 12:24pm

Continual finetuning is a critical component of my design of NLCA (natural language cognitive architecture). Continual finetuning will be necessary for AGI to be realized. Even if we cannot technically perform continual finetuning right now, we can at least continually accumulate data and periodically perform finetuning operations. For now, this is my stopgap measure in the pursuit of AGI and a machine that can continually learn.

NSY · January 9, 2022, 1:30pm

Hey @daveshapautomator, couldn’t we fine-tune by starting from scratch and updating the base JSON file? Is there really a difference?

daveshapautomator · January 9, 2022, 3:17pm

Yes that’s what I mean. By accumulating more data, you can continue to integrate more information. I just imagine that there may be more efficient methods in the future, such as repeatedly fine-tuning one model.

asabet · January 9, 2022, 4:03pm

@daveshapautomator continuous finetuning is just online learning, except in batches. Continuous training is a very basic and common practice in ML production systems.

@NSY yes there is a substantial difference. Take a look at the finetune pricing docs. Suppose you have a db with 1M tokens, and receive a new batch of 100K additional tokens from new data. Retraining davinci on all 1,100K tokens for 4 epochs at $0.03 per 1K tokens would cost 1100 * 0.03 * 4 = $132, rather than 100 * 0.03 * 1 = $3 (finetune on the new 100K tokens for only 1 epoch). That’s a $129 difference.
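
The arithmetic above can be checked with a short sketch. The function name `finetune_cost` is made up for illustration, and the $0.03 per 1K tokens rate and the 4-epoch retrain are simply the figures used in this post, not values read from any API.

```python
def finetune_cost(tokens: int, epochs: int, price_per_1k: float = 0.03) -> float:
    """Training cost in dollars, assuming every token is billed once per epoch.

    price_per_1k is the per-1K-token rate quoted in the post (an assumption here).
    """
    return tokens / 1000 * price_per_1k * epochs


existing_tokens = 1_000_000  # tokens already in the training set
new_tokens = 100_000         # newly collected batch

# Option 1: retrain the base model on everything for 4 epochs.
full_retrain = finetune_cost(existing_tokens + new_tokens, epochs=4)

# Option 2: continue from the finetuned model on the new batch for 1 epoch.
incremental = finetune_cost(new_tokens, epochs=1)

print(f"full retrain: ${full_retrain:.2f}")
print(f"incremental:  ${incremental:.2f}")
print(f"difference:   ${full_retrain - incremental:.2f}")
```

The gap grows with every new batch, since the full retrain pays again for all of the older data each time.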

daveshapautomator · January 9, 2022, 4:34pm

The difference is that online learning isn’t available for GPT-3.

asabet · January 9, 2022, 4:52pm

@daveshapautomator read my original post please. This is a feature request lol.

Also changed the title to make things clearer.

claire.gong · January 13, 2022, 7:30am

I have the same need! Hope they will include this in the API soon.

davidmitt · October 19, 2022, 5:20am

Is this already implemented? I have noticed it is possible to pass a fine-tuned model as the base model, but I am not sure if the behavior is as described here. I have tried to verify this myself, but whenever I try to fine-tune an already fine-tuned model, the resulting model doesn’t seem to work (I get an error when trying to use it for completions).

sergeliatko · October 19, 2022, 6:55am

Hi, yesterday I fine-tuned a pre-trained model with additional data (passed its name as a base model parameter value) which resulted in a new model. Looks like it is implemented as expected. And yes, there is a huge improvement in completion results.
