Fine tuned model providing worse output

warrenjday · October 3, 2022, 10:22pm

I’m just trying a few things on the playground and used davinci to generate some short tweets.

If I give a prompt with some args:

Stop sequence: ###, 6.

Generate a list of 5 tweets.

Output:

###

For the default davinci model, I received 5 tweets as expected. But using a fine-tuned model, which uses davinci as the base model, the output is pretty terrible. I receive an incorrect number of responses and each piece of content doesn’t really make any sense.

For example with the above prompt I received:

Awesome site!: http://github.com

Nice tools: https://pnpm.8bitf xed.com # 1 of 5

I may be misunderstanding, but I had assumed the fine-tuned model would be at least as good as the default davinci model. Instead it seems to provide much worse output, although I only provided a dataset of around 50 tweets.

Does fine-tuning only provide value once you give at least a few hundred examples in the training data, and is some property of the default model lost when fine-tuning?

daveshapautomator · October 4, 2022, 9:03pm

Keep in mind that the default model right now is TEXT-DAVINCI-002 which has been finetuned to know how to follow instructions. When you finetune your own model, you are starting with vanilla DAVINCI, which is basically just an unstructured autocomplete engine.

200 examples is the bare minimum. You will also need to ensure that your finetuning data is well formatted.

warrenjday · October 4, 2022, 9:35pm

Ahh that makes total sense. Thank you!

silvacarl · October 6, 2022, 3:18pm

cleaning and maintaining the fine tunings is the problem and there are no good tools to do this anywhere.

check and make sure that your fine tuning rules do not overlap or conflict with each other to produce different completions.

if they do, get rid of one of them.

then re-upload your fine tuning data and try again.
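A minimal sketch of the overlap check described above, assuming the training data is a JSONL file of `prompt`/`completion` records like the ones posted later in this thread. The `find_conflicts` helper and the sample records are hypothetical; it only flags prompts that are identical after lowercasing and whitespace cleanup, so near-duplicates with different wording would still need a manual look. For generative tasks, the same prompt with several completions may be intentional, so treat the result as a list to review rather than a list to delete.

```python
import json
from collections import defaultdict

def find_conflicts(jsonl_lines):
    """Return prompts that map to more than one distinct completion."""
    completions_by_prompt = defaultdict(set)
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Normalize case and whitespace so trivially different prompts match.
        key = " ".join(record["prompt"].lower().split())
        completions_by_prompt[key].add(record["completion"].strip())
    return {p: sorted(c) for p, c in completions_by_prompt.items() if len(c) > 1}

# Hypothetical sample data, not taken from a real training set.
sample = [
    '{"prompt": "What is gallabox?", "completion": "A communication tool."}',
    '{"prompt": "what is  Gallabox?", "completion": "A pricing plan."}',
    '{"prompt": "Is tax included?", "completion": "No."}',
]
conflicts = find_conflicts(sample)
for prompt, completions in conflicts.items():
    print(prompt, "->", completions)
```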

dean · October 7, 2022, 4:02am

In addition to using a larger dataset, as others have mentioned, it may help to format the fine-tuning data similarly to your prompt above, i.e. by including “Generate a list of 5 tweets” before the tweet examples.
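One way this could look, as a rough sketch: each training record reuses the instruction from the original prompt and puts a group of five tweets in the completion, ending with the `###` stop sequence. The exact prompt layout, the leading space in the completion, the numbering, and the `build_records` helper are all assumptions for illustration, and the tweets are placeholders.

```python
import json

# Assumed layout: same instruction text as the playground prompt.
INSTRUCTION = "Generate a list of 5 tweets.\n\nOutput:\n\n"
STOP = "###"  # same stop sequence as in the original post

def build_records(tweets, per_example=5):
    """Group tweets into records of `per_example`; leftover tweets are dropped."""
    records = []
    for start in range(0, len(tweets) - per_example + 1, per_example):
        group = tweets[start:start + per_example]
        numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(group, 1))
        records.append({
            "prompt": INSTRUCTION,
            "completion": " " + numbered + "\n" + STOP,
        })
    return records

# Placeholder tweets standing in for a real dataset.
tweets = [f"Placeholder tweet {n}" for n in range(1, 11)]
records = build_records(tweets)
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)
```

With the instruction in every training prompt, the fine-tuned model sees the same pattern at training time that it will see at inference time.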

inovaproduto1 · February 21, 2023, 3:53pm

This answer is exactly what I was searching for. I also read in other threads that it is expensive to train a vanilla davinci. I think OpenAI should offer a way to train their instruction-tuned models, because it would save time and money for their customers, and the results could also be better.

In my project I am currently using text-davinci-003 with a few examples in the prompt itself. It is working fine, although it is not optimal because of the repetitive data I am sending. It would be much better if OpenAI had another parameter alongside “model”, say “context”, so that we could choose text-davinci-003, set a context once, and reuse it, instead of fine-tuning from zero or sending the same examples in the prompt again and again.
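The few-shot approach described here can be sketched as a small helper that keeps the examples in one place and prepends them to each request. This is only an illustration: `EXAMPLES`, `build_prompt`, and the `Request:`/`Tweet:` labels are hypothetical, and the examples are still sent with every call, so it tidies the code without reducing the tokens used.

```python
# Hypothetical reusable examples; replace with real request/tweet pairs.
EXAMPLES = [
    ("Announce a product launch", "Placeholder tweet about a launch."),
    ("Share a tip", "Placeholder tweet with a tip."),
]

def build_prompt(request, examples=EXAMPLES):
    """Prepend the fixed examples to a new request, leaving the last answer open."""
    parts = [f"Request: {req}\nTweet: {tweet}" for req, tweet in examples]
    parts.append(f"Request: {request}\nTweet:")
    return "\n\n".join(parts)

prompt = build_prompt("Thank our followers")
print(prompt)
```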

manoj.ravichandran · March 7, 2023, 11:19am

Is this format correct?

{"prompt": "What is gallabox?", "completion": "Gallabox is a single, collaborative communication tool, which empowers businesses to help convert their customer conversations into business-oriented actions."}
{"prompt": "Are the Gallabox pricing plans inclusive of tax?", "completion": "No, Gallabox\u2019s subscription rates and any add-on requirements are exclusive of tax."}
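A quick structural check for data like the above, as a sketch: it confirms that every line is valid JSON with non-empty string `prompt` and `completion` fields. The `check_jsonl` helper is hypothetical, and passing it says nothing about training quality or about separator and stop-sequence conventions.

```python
import json

def check_jsonl(lines):
    """Return a list of (line_number, problem) tuples; empty means no problems found."""
    problems = []
    for number, line in enumerate(lines, 1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{key}'"))
    return problems

# The two records from the post, kept as raw strings so \u2019 stays a JSON escape.
sample = [
    r'{"prompt": "What is gallabox?", "completion": "Gallabox is a single, collaborative communication tool, which empowers businesses to help convert their customer conversations into business-oriented actions."}',
    r'{"prompt": "Are the Gallabox pricing plans inclusive of tax?", "completion": "No, Gallabox\u2019s subscription rates and any add-on requirements are exclusive of tax."}',
]
problems = check_jsonl(sample)
print(problems if problems else "no structural problems found")
```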
