Fine-tuned model providing worse output

I’ve been trying a few things in the Playground and used davinci to generate some short tweets.

If I give a prompt with some args:

Stop sequence: ###, 6.

Generate a list of 5 tweets.



For the default davinci model, I received 5 tweets as expected. But using a fine-tuned model, which uses davinci as the base model, the output is pretty terrible. I receive an incorrect number of responses and each piece of content doesn’t really make any sense.

For example with the above prompt I received:

Awesome site!:

Nice tools: https://pnpm.8bitf # 1 of 5 

I may be misunderstanding, but I had assumed the fine-tuned model would be at least as good as the default davinci model. Instead it seems to produce much worse output, although I only provided a dataset of around 50 tweets.

Does fine-tuning only provide value once you give at least a few hundred examples in the training data, and is some property of the default model lost when fine-tuning?


Keep in mind that the default model right now is text-davinci-002, which has been fine-tuned to follow instructions. When you fine-tune your own model, you are starting with vanilla davinci, which is basically just an unstructured autocomplete engine.

200 examples is the bare minimum. You will also need to ensure that your fine-tuning data is well formatted.


Ahh that makes total sense. Thank you!

Cleaning and maintaining the fine-tuning data is the problem, and there are no good tools for this anywhere.

Check that your fine-tuning examples do not overlap or conflict with each other, that is, that the same or near-identical prompts do not lead to different completions.

If they do, get rid of one of them.

Then re-upload your fine-tuning data and try again.
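
The overlap check described above can be sketched in a few lines of Python. This is only a minimal sketch, assuming the training file is JSONL with `prompt` and `completion` keys; the function name `find_conflicts` and the sample records are made up for illustration.

```python
import json
from collections import defaultdict


def find_conflicts(jsonl_text):
    """Return prompts that appear with more than one distinct completion."""
    completions_by_prompt = defaultdict(set)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # Normalize case and whitespace so near-identical prompts collide.
        key = " ".join(record["prompt"].lower().split())
        completions_by_prompt[key].add(record["completion"].strip())
    return {
        prompt: sorted(completions)
        for prompt, completions in completions_by_prompt.items()
        if len(completions) > 1
    }


# Made-up sample data, not taken from any real training set.
sample = "\n".join(
    json.dumps(record)
    for record in [
        {"prompt": "Write a tweet about coffee.", "completion": " Coffee first."},
        {"prompt": "write a tweet about  coffee.", "completion": " Tea is better."},
        {"prompt": "Write a tweet about rain.", "completion": " Rain again."},
    ]
)
conflicts = find_conflicts(sample)
for prompt, completions in conflicts.items():
    print(prompt, "->", completions)
```

Each reported prompt is a candidate for removing or merging one of its examples before re-uploading. Exact matching after normalization will miss paraphrased prompts, so a manual pass is probably still needed.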

In addition to using a larger dataset, as others have mentioned, it may help to format the fine-tuning data similarly to your prompt above i.e. including “Generate a list of 5 tweets” before the tweet examples.
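
As a rough illustration of that suggestion, the sketch below builds training records whose prompt repeats the instruction from the original post. The `PROMPT_END` separator, the helper names, and the placeholder tweets are assumptions made for the example; only the instruction text and the `###` stop sequence come from the thread.

```python
import json

INSTRUCTION = "Generate a list of 5 tweets."
PROMPT_END = "\n\n"  # assumed separator between instruction and completion
STOP = "\n###"       # mirrors the "###" stop sequence from the original prompt


def build_record(tweets):
    """Turn one group of tweets into a prompt/completion training record."""
    numbered = "\n".join(f"{i}. {tweet}" for i, tweet in enumerate(tweets, start=1))
    return {"prompt": INSTRUCTION + PROMPT_END, "completion": numbered + STOP}


def to_jsonl(groups):
    """Serialize one JSON object per line, one line per group of tweets."""
    return "\n".join(json.dumps(build_record(group)) for group in groups)


# Placeholder tweets for illustration only.
groups = [[f"Example tweet {n}" for n in range(1, 6)]]
jsonl = to_jsonl(groups)
print(jsonl)
```

At inference time the prompt would then be the same instruction followed by the same separator, with `###` as the stop sequence, so the model sees the pattern it was trained on.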

This answer is exactly what I was searching for. I have also read in other threads that fine-tuning vanilla davinci is expensive. I think OpenAI should offer a way to fine-tune their already fine-tuned models, because it would save customers time and money and could also give better results. In my project I am currently using text-davinci-003 with a few examples in the prompt itself. It works fine, although it is not optimal because I send the same data with every request. It would be much better if OpenAI added another parameter alongside “model”, say “context”, so that we could choose text-davinci-003, set a context once, and reuse it, instead of fine-tuning from scratch or sending the same examples in the prompt again and again.

Is this format correct?

{"prompt": "What is gallabox?", "completion": "Gallabox is a single, collaborative communication tool, which empowers businesses to help convert their customer conversations into business-oriented actions."}
{"prompt": "Are the Gallabox pricing plans inclusive of tax?", "completion": "No, Gallabox\u2019s subscription rates and any add-on requirements are exclusive of tax."}
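
Structurally, both lines are valid JSON objects with string `prompt` and `completion` fields, one per line, so they parse as JSONL. Whether they follow the usual style conventions is a separate question. The sketch below checks both; the conventions it tests (a fixed separator ending every prompt, a leading space on every completion) are as I recall them from the legacy fine-tuning guidance, and the separator value and the function name `check_jsonl` are assumptions for illustration.

```python
import json


def check_jsonl(text, separator="\n\n###\n\n"):
    """Return a list of problems found in prompt/completion JSONL text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append(f"line {number}: invalid JSON ({error.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        missing = [k for k in ("prompt", "completion") if not isinstance(record.get(k), str)]
        if missing:
            problems.append(f"line {number}: missing string field(s) {missing}")
            continue
        # Style checks: assumed conventions, not hard requirements.
        if not record["prompt"].endswith(separator):
            problems.append(f"line {number}: prompt does not end with the separator")
        if not record["completion"].startswith(" "):
            problems.append(f"line {number}: completion does not start with a space")
    return problems


# The two records from the post above, kept verbatim in a raw string.
sample = r'''{"prompt": "What is gallabox?", "completion": "Gallabox is a single, collaborative communication tool, which empowers businesses to help convert their customer conversations into business-oriented actions."}
{"prompt": "Are the Gallabox pricing plans inclusive of tax?", "completion": "No, Gallabox\u2019s subscription rates and any add-on requirements are exclusive of tax."}'''
for problem in check_jsonl(sample):
    print(problem)
```

On the two records above this reports only the style notes, not any parsing errors.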