Hello,
I am attempting to fine-tune some models to have fun with various tasks. I'm new here, so as you'd expect I hit some surprising edges that took me a while to figure out. I thought others (and Google) might benefit if I shared.
Following the Fine Tuning guide, I got to the openai tools fine_tunes.prepare_data command. I quickly threw together a sample dataset to test it out and got the dreaded ERROR in read_any_format validator error:
❯ openai tools fine_tunes.prepare_data -f ./canadian-weather.jsonl
Analyzing...
ERROR in read_any_format validator: Your file `./canadian-weather.jsonl` does not appear to be in valid JSONL format. Please ensure your file is formatted as a valid JSONL file.
Aborting...
Clearly I have a JSONL formatting issue and this will be trivial to fix, right? Yup! That's what I said too, 90 minutes ago. Here's what my canadian-weather.jsonl file looks like:
{"prompt": "How is the weather today? PROMPT_SEPARATOR", "completion": " I stuck my head out the window and froze my ears off. It's -20 degrees celcius, or -45 with the windchill! What did you expect in a Canadian winter, eh? STOPSTOP"}
I followed all the advice I could find on this forum, including this excellent post from @ruby_coder. I used every JSONL validator I could find. Triple checked the encoding was UTF-8. Tried full file paths. And kept reducing the file further and further until I couldn’t anymore.
None of this resolved the validation issue.
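To convince myself the formatting really was fine, I ended up writing a tiny checker of my own (this is just a sketch, not part of the openai tooling; the sample line is abbreviated from my file):

```python
import json

def check_jsonl(text):
    """Return (line_number, problem) pairs for lines that aren't valid
    prompt/completion JSON records."""
    problems = []
    for i, line in enumerate(text.splitlines(), start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((i, f"invalid JSON: {exc}"))
            continue
        # prepare_data expects both a "prompt" and a "completion" key
        missing = {"prompt", "completion"} - record.keys()
        if missing:
            problems.append((i, f"missing keys: {sorted(missing)}"))
    return problems

sample = '{"prompt": "How is the weather today? PROMPT_SEPARATOR", "completion": " ... STOPSTOP"}'
print(check_jsonl(sample))  # -> [] : the single-line file checks out
```

My file passed this check cleanly, which is exactly why the validator error was so baffling.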
Did you spot the issue yet?
If you guessed the file was too short, you are an excellent debugger! For those who just want to enjoy the show, I tried adding a second line to the file as follows:
{"prompt": "How is the weather today? PROMPT_STOP", "completion": " I stuck my head out the window and froze my ears off. It's -20 degrees celcius, or -45 with the windchill! What did you expect in a Canadian winter, eh?"}
{"prompt": "How's today's weather looking? PROMPT_STOP", "completion": " Whew, I wish you waited until it was warmer before asking. I bundled up in my mittens and touque and darm near didn't make it back alive. It's -25 degrees celcius now, or -50 with the windchill! Even worse than yesterday. We should visit California today, eh?"}
And now openai tools fine_tunes.prepare_data -f ./canadian-weather-multiline.jsonl was happy and provided some good feedback on this file.
❯ openai tools fine_tunes.prepare_data -f ./canadian-weather-multiline.jsonl
Analyzing...
- Your file contains 2 prompt-completion pairs. In general, we recommend having at least a few hundred examples. We've found that performance tends to linearly increase for every doubling of the number of examples
- More than a third of your `prompt` column/key is uppercase. Uppercase prompts tends to perform worse than a mixture of case encountered in normal language. We recommend to lower case the data if that makes sense in your domain. See https://beta.openai.com/docs/guides/fine-tuning/preparing-your-dataset for more details
- All prompts end with suffix `? PROMPT_STOP`. This suffix seems very long. Consider replacing with a shorter suffix, such as ` ->`
- All prompts start with prefix `How`
- All completions end with suffix `, eh?`
Based on the analysis we will perform the following actions:
- [Recommended] Lowercase all your data in column/key `prompt` [Y/n]:
After making the recommended changes I ended up with this dataset file:
{"prompt": "How is the weather today? ->", "completion": " I stuck my head out the window and froze my ears off. It's -20 degrees celcius, or -45 with the windchill! What did you expect in a Canadian winter, eh?"}
{"prompt": "How's today's weather looking? ->", "completion": " Whew, I wish you waited until it was warmer before asking. I bundled up in my mittens and touque and darm near didn't make it back alive. It's -25 degrees celcius now, or -50 with the windchill! Even worse than yesterday. We should visit California today, eh friend?"}
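As an aside, generating the file from code avoids hand-editing mistakes like forgetting the " ->" suffix or the leading space in the completion. This is just a sketch of my own (not part of the openai tooling); the prompts are from my dataset, with the completions abbreviated:

```python
import json

# Prompt/completion pairs, before adding the separator suffix and
# the leading space the completion needs.
pairs = [
    ("How is the weather today?",
     "I stuck my head out the window and froze my ears off."),
    ("How's today's weather looking?",
     "Whew, I wish you waited until it was warmer before asking."),
]

# json.dumps handles all the quoting/escaping for us.
lines = [
    json.dumps({"prompt": f"{p} ->", "completion": f" {c}"})
    for p, c in pairs
]

with open("canadian-weather-multiline.jsonl", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")

print(len(lines))  # prepare_data wants at least two examples
```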
This passed all the checks, so I've proceeded to the fine_tunes.create step for creating my Bob & Doug McKenzie weather service.
Hopefully this saves others some trouble and provides some additional perspective to the validation team. Starting with a single prompt seemed like a sensible way to begin before scaling things up. It wasn't a bad idea, but for now at least, start with two entries.