GPT-3 fine-tuning for large text summaries

Hey guys, I just completed my first fine-tuning run and, well… it didn’t go quite as I expected. I made a dataset of around 110 prompts and their summaries and fine-tuned davinci on it. I am trying to get summaries of large pieces of text that are more detailed than the tl;dr preset provided in the Playground. I spent around $30, but GPT-3 seems to be doing worse than the Playground. It either paraphrases the given input, repeats it, or starts writing more content, when I just need it to summarize. I even gave it one of the prompts I trained it with, and it still couldn’t produce anything close to a summary… To train the model, I used the following command: openai api fine_tunes.create -t "training_prepared.jsonl" --no_packing -m davinci. Anyone know what could be going on? Has anyone trained GPT-3 to summarize text and can help me? Thanks a lot in advance!
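For context, the legacy fine-tuning endpoint expects a JSONL file with one prompt/completion pair per line, and the docs recommend ending every prompt with a fixed separator and every completion with a stop token. Here's a rough sketch of producing one such line (placeholder text; the "\n\n###\n\n" separator and " END" suffix are the docs' suggested conventions, not necessarily what's in my file):

```python
import json

# Hypothetical example of one training pair in the legacy fine-tune format.
# The "\n\n###\n\n" separator and trailing " END" follow the docs' suggested
# conventions; the key is to use the same ones consistently everywhere.
example = {
    "prompt": "<long article text>\n\n###\n\n",
    "completion": " <short, detailed summary> END",
}

with open("training_prepared.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")
```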


Just to be clear: you are for sure testing on your new fine-tuned model, right? Not the original davinci one?

Pretty sure I am. I am using openai api completions.create -m davinci:ft-personal-2021-12-21-21-27-29 -p <PROMPT>

What temperature are you using? A higher temperature than you might usually use can help with a fine-tuned model. And what does a sample input/output pair look like from your dataset? Feel free to DM me if you prefer!

How would you set the temperature when using a fine-tuned model? I didn’t know that was possible.

Just read the help menu and I see it. I’ll mess around with the settings until I can get it to do what I want. The results look better than they used to, but still not great…
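For anyone finding this thread later: with the pre-1.0 openai Python library, temperature is just a parameter on Completion.create, fine-tuned model or not. A minimal sketch (the temperature and max_tokens values are only examples to experiment with):

```python
import openai  # pre-1.0 openai-python library

openai.api_key = "sk-..."  # your API key

response = openai.Completion.create(
    model="davinci:ft-personal-2021-12-21-21-27-29",  # fine-tuned model from earlier
    prompt="<text to summarize>",
    temperature=0.4,  # example value; lower tends to give more focused output
    max_tokens=200,   # cap the summary length
)
print(response["choices"][0]["text"])
```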

How can I DM you a sample training example?

Can you share a sample input? Are you terminating it the same way you did with your training data (i.e., ‘\n\n###\n\n’)?
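To spell that out: the fine-tuning guide says the inference prompt should end with the exact separator the training prompts ended with, and the completion’s end token should be passed as a stop sequence. Roughly, assuming the "\n\n###\n\n" / " END" conventions sketched above:

```python
import openai

openai.api_key = "sk-..."  # your API key

text = "<the long text to summarize>"
prompt = text + "\n\n###\n\n"  # must match the separator used in the training prompts

response = openai.Completion.create(
    model="davinci:ft-personal-2021-12-21-21-27-29",  # fine-tuned model from above
    prompt=prompt,
    max_tokens=250,
    stop=[" END"],  # assumed end-of-completion token from training; use yours
)
print(response["choices"][0]["text"].strip())
```

If the prompt doesn’t end with the training separator, the model has no signal that the input is finished, which would explain it continuing the text instead of summarizing.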