Hey all.
I have a dataset consisting of short children’s stories of a certain reading level.
The stories look like this:
Wake up May.
It’s Monday
It’s a school day.
Come and eat breakfast.
Good morning, Mom.
What time is it?
It’s seven o’clock.
etc.
I want to fine-tune a model to be the best it can be in generating stories for this level.
How would I best prepare the dataset? Any thoughts?
Thank you!
Some common use cases where fine-tuning can improve results:
Setting the style, tone, format, or other qualitative aspects
Improving reliability at producing a desired output
Correcting failures to follow complex prompts
Handling many edge cases in specific ways
Performing a new skill or task that’s hard to articulate in a prompt
One high-level way to think about these cases is when it’s easier to “show, not tell”. In the sections to come, we will explore how to set up data for fine-tuning and various examples where fine-tuning improves the performance over the baseline model.
Another scenario where fine-tuning is effective is in reducing costs and / or latency, by replacing GPT-4 or by utilizing shorter prompts, without sacrificing quality. If you can achieve good results with GPT-4, you can often reach similar quality with a fine-tuned gpt-3.5-turbo model by fine-tuning on the GPT-4 completions, possibly with a shortened instruction prompt.
If you want to go with prompt engineering, here’s a playground example to get you started.
I understand. I was wondering if someone has experience fine-tuning for story generation. I am particularly curious about if the prompt should be the entire prompt I usually use to create a certain story OR is there a way to use only parts.
For instance
I can have a instruction like (shortened…)
PREAMBLE: You are a ...
Write a short story using this and that formatting
INSTRUCTIONS
What follows are Oxford Reading Guide writing instructions to adhere to.
-- Common sentence structure patterns used:
Subject-verb-object (e.g. They got help; I like jam tarts!).
Fronted adverbials (e.g. “Quick, grab my hand!” he said; In the ring, jugglers do tricks with clubs and torches.).
Subject-verb-adverbial (e.g. He looked in the box; It feeds on krill.).
Now would it make sense to take the generic PREAMBLE and a single instruction as the fine-tuned prompt and a desired completion (part of the story)?
Or
Do I need to take the entire prompt (plus 20 instructions sometimes) and its correct completion?