Davinci not learning new patterns after fine-tuning, forgetting to answer questions

ben.basseri · May 14, 2023, 12:31am

I’m trying to fine-tune davinci to write multiple choice questions following a certain structure, tone and style. But after fine-tuning on the training data, the model fails to consistently generate new multiple choice questions matching even the basic structure of the training samples: a problem statement and 5 response options.

The dataset is 400 samples consisting of

"prompt": "Write a multiple choice question about ... \n\n###\n\n", 
"completion": " <full problem statement> ###"

I fine-tuned for 2 epochs, no learning rate specified. I have tried fine-tuning with learning rates between 0.2 and 1, and all models fail to write questions following the structure in the training data.

The model also seems to have catastrophically forgotten how to answer questions. For instance, if you ask text-davinci-003 “What is the capital of France?” you typically get the completion Paris.. However, the fine-tuned models complete:

What is the capital of France?

What is the capital of France?

or something similarly repetitive.

I have experimented with different temperatures, max_tokens, and frequency/presence penalties with no improvement. Are there any best practices for fine tuning the model so it learns to complete following its training examples when prompted, without catastrophically forgetting other things it used to be good at? I have already read through the entirety of the documentation here. Thanks!

ben.basseri · May 14, 2023, 12:47am

Hi elmstedt! Yes all the training completions have the same basic structure of question statement followed by 5 response options, and all prompts and completions in the dataset have been preprocessed with separators like in the example, and I do use that separator when asking the fine-tuned model for completions. I can try more epochs though!

kevin6 · May 14, 2023, 12:48am

elmstedt is right, 400 is not enough, try more than 750

This link also might be helpful > Fine-tuning a Classifier to Improve Truthfulness | OpenAI Help Center

ben.basseri · May 14, 2023, 1:28am

Great minds! I’m trying GPT4 to do data augmentation and it’s looking pretty good

kevin6 · May 14, 2023, 1:50am

So if you using synthetic GPT-4 generated sample, you may also need to consider this, Aligning language models to follow instructions , what you are doing is supervised fine-tuning

ben.basseri · May 27, 2023, 7:08pm

After checking the documentation again I noticed that the models currently available for fine-tuning are pre-InstructGPT models. So it wasn’t catastrophically forgetting QA, the models had never learned it! Thanks for all the help.

Topic		Replies	Views
Test training Davinci and completion after training API	3	757	December 20, 2023
Struggling with poor performance on fine-tuned davinci model API	15	2645	December 20, 2023
Fine-tuned model unable to answer prompts from training data API fine-tuning , davinci	10	1901	December 24, 2023
Fine Tuned Chatbot forgets how to output summary of conversation API	9	1829	December 18, 2023
Fine-tuning quality davinci vs text-davinci-003 API	1	860	February 3, 2023

Davinci not learning new patterns after fine-tuning, forgetting to answer questions

Related topics