Building the first fine-tuned model

I’m learning how fine-tuning works and the first results are not great so far.

I followed these steps:

  1. I created a local JSONL file with this content:
{"prompt": "@anchor, how many ducks can I buy from the nearby store?###\n", "completion": "The store has a limit of 20 ducks per client."}
{"prompt": "@anchor, how can I buy more ducks quickly?###\n", "completion": "You can ask the owner of the store to increase the limit."}
  2. I uploaded the file.
  3. The fine-tuned model was created using ‘davinci’.
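Step 1 can also be done programmatically instead of by hand; a minimal sketch that writes the two training examples from the post to a JSONL file (the file name `ducks.jsonl` is just an assumption):

```python
import json

# Each example pairs a prompt (ending in the "###\n" separator used
# in the post) with its completion, one JSON object per line.
examples = [
    ("@anchor, how many ducks can I buy from the nearby store?",
     "The store has a limit of 20 ducks per client."),
    ("@anchor, how can I buy more ducks quickly?",
     "You can ask the owner of the store to increase the limit."),
]

with open("ducks.jsonl", "w") as f:
    for prompt, completion in examples:
        line = json.dumps({"prompt": prompt + "###\n",
                           "completion": completion})
        f.write(line + "\n")
```

Using `json.dumps` avoids hand-escaping quotes and newlines inside the prompts.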

I went to the OpenAI Playground and tested my fine-tuned model.
My prompt was:
@anchor, how many ducks can I buy from the nearby store?

But the response was weird:
The answer is 2. The reason is that 2 is the smallest number that is greater than or equal to 2 and is divisible by both 2 and 3.

When testing with ‘text-davinci-002’, the response is legit:
I can't answer that question without more information.

I know that my dataset is small, but I was expecting at least a normal response.

What am I doing wrong?

With such a small dataset you are better off just using text-davinci-002 with better prompt design. Try few-shot learning. If you want to fine-tune a model, you should have maybe 30x the fine-tuning data.
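Few-shot learning here just means putting a couple of solved examples in the prompt itself before the new question. A minimal sketch of building such a prompt, reusing the Q/A pairs from the post (no API call shown; the Q:/A: layout is one common convention, not the only one):

```python
# Solved examples the model sees before the new question.
few_shot_examples = [
    ("@anchor, how many ducks can I buy from the nearby store?",
     "The store has a limit of 20 ducks per client."),
    ("@anchor, how can I buy more ducks quickly?",
     "You can ask the owner of the store to increase the limit."),
]

def build_prompt(question):
    # Stack the examples, then end with the open question so the
    # model continues in the same Q:/A: pattern.
    parts = [f"Q: {q}\nA: {a}" for q, a in few_shot_examples]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# The resulting string would be sent as the prompt to text-davinci-002.
prompt = build_prompt("@anchor, is there a daily limit on ducks?")
```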

Right now I’m at the proof-of-concept level.
In this example I was testing whether I could obtain a completion for a very specific case.
It didn’t work.

I’m no expert by any means, but this is a good enough way of thinking about it:
The AI will give the “average” answer it has. Since it’s pre-trained, it already has an “average” answer, so you need to give it enough data to move that average. So in theory, you could train the AI with a 100,000-prompt file where every entry is the same prompt and answer, and that would move the average.

What you are actually trying to do is have GPT-3 predict the next word in a prompt, all the way through the completion. By using “@anchor” you are helping it know what to predict. It probably never saw “@anchor” in pretraining, so the more you show it “@anchor” in your training data, the more likely it is to predict your personalized prompts and completions. As a proof of concept, you could do what I mentioned above. Just don’t use that model afterwards, because it would probably have a hard time answering anything else XD
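The “move the average” experiment described above could be generated like this; purely illustrative, with the count scaled down from the 100,000 mentioned and a made-up file name:

```python
import json

def make_duplicated_dataset(prompt, completion, n, path):
    # Write the same prompt/completion pair n times, one JSON object
    # per line, to deliberately skew the model toward one answer.
    record = json.dumps({"prompt": prompt + "###\n",
                         "completion": completion})
    with open(path, "w") as f:
        for _ in range(n):
            f.write(record + "\n")

make_duplicated_dataset(
    "@anchor, how many ducks can I buy from the nearby store?",
    "The store has a limit of 20 ducks per client.",
    1000,  # scaled down from the 100,000 in the text
    "duplicated.jsonl",
)
```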

Edit: You usually don’t want duplicates. Depending on how you train, OpenAI’s data prep tool will delete duplicates, so if you want to run that experiment, you would have to manually tell it not to delete them. Also, remember all of this has a cost, and you don’t actually want to spend thousands on a gag.
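A quick way to check a JSONL file for duplicate pairs yourself before uploading; just a sketch, and the data prep tool mentioned above remains the authoritative check:

```python
import json
from collections import Counter

def count_duplicates(path):
    # Count identical prompt/completion pairs in a JSONL training file
    # and return only the pairs that appear more than once.
    counts = Counter()
    with open(path) as f:
        for line in f:
            if line.strip():
                rec = json.loads(line)
                counts[(rec["prompt"], rec["completion"])] += 1
    return {pair: n for pair, n in counts.items() if n > 1}
```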

I managed to do the proof of concept with around 100 prompts for a specific text.

Funny thing: “anchor” was already in the model before my fine-tune, but with variations.

Now my problem is that the response is reproduced verbatim, exactly as written in the prompt file. I wish the responses were more creative.
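One knob worth trying for that is the sampling temperature: low values make the model stick closely to what it memorized, higher values make the wording more varied. A sketch of completions request parameters (no API call made here; the model ID is a placeholder for a real fine-tune ID, and the exact values are assumptions to experiment with):

```python
# Parameters for a completions request; higher temperature trades
# fidelity to the training data for more varied wording.
request = {
    "model": "davinci:ft-placeholder",  # hypothetical fine-tune ID
    "prompt": "@anchor, how many ducks can I buy from the nearby store?###\n",
    "temperature": 0.8,  # 0 = near-deterministic, higher = more creative
    "max_tokens": 64,
    "stop": ["\n"],
}
```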