Hi everyone!
I’m an author who has suddenly found herself to be an AI researcher and educator for creative writing. I don’t want authors left behind.
This is a beginner-level write-up of a fine-tuning experiment with only 10 examples in the dataset. I’m sharing because when I went looking for answers to some of these questions, I couldn’t find them.
We made a 16,000-token dataset of 10 novel outlines in 10 different genres. Each example had a System prompt that defined a persona, “Outlinemagedon AI,” and a simple User prompt that turns a 2-3 sentence novel premise into an outline.
The goal was to get 3.5 16k, which typically writes outlines in very choppy sentences with strict formatting (Roman numeral headers and capital-letter sub-items), to instead write in paragraph form with simple formatting, the way GPT 4 does.
It was a success! We were stunned! Settings were temperature 0.7, top p 0.7, and presence penalty 1. If you want to do a blind evaluation yourself, the key to which sample is which is at the bottom of the post:
https://shy-jackrabbit-e9e.notion.site/Comparison-1-Outlines-b280702d97ea4b26a503d27df0612995?pvs=4
- We only had 10 samples and still really moved the needle, though each sample was long: 1200-1600 tokens across system, user, and assistant.
- Everyone at the Future Fiction Academy (a school I run for authors) agreed that Sample A was superior in both length and specificity.
- We are all excited to push the limits further with small datasets that help GPT 3.5 16k Turbo write longer and better fiction prose in the first pass from a variety of prompting styles, including longer “mega prompting” or context stuffing, so we don’t have to do so many chains of “write it longer.”
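For anyone who wants to try the same sampling settings, here is a minimal sketch of how they might look as a request payload. The field names (`temperature`, `top_p`, `presence_penalty`) are the ones the OpenAI Chat Completions API uses. The function name `build_request` and the model ID in the usage line are placeholders I made up for illustration, not what we actually ran.

```python
def build_request(model: str, system_prompt: str, user_prompt: str) -> dict:
    """Bundle the prompts with the sampling settings mentioned in this post."""
    return {
        "model": model,  # placeholder: use your own base or fine-tuned model ID
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.7,     # temperature .7
        "top_p": 0.7,           # top p .7
        "presence_penalty": 1,  # presence penalty 1
    }


# Hypothetical usage with placeholder text
request = build_request("your-model-id", "system prompt here", "user prompt here")
```

With the official `openai` Python package (version 1 or later), a dict like this can be passed as keyword arguments to `client.chat.completions.create`.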
Our methodology was:
- We used the System and User prompts to make the 10 outlines with 3.5 16k. These were our baseline for comparison.
- We decided which parts we did not like (choppy sentences, funky formatting) and prompted GPT 4 to rewrite each outline in paragraphs and clean up the formatting.
- We then cleaned up any GPT 4 outlines that still had bizarre formatting, like a character list we didn’t ask for, and put everything in tags [novel outline] [/novel outline]. We like to use square brackets for our long prompts so that it’s easier to call to those elements for a writing task.
- Then we used a tool our developer made for us that takes the System, User, and Assistant Response (the cleaned-up GPT 4 outlines) and converts them into JSONL format. This is not easy for non-coding people to do, even if they are experts in the field the dataset covers.
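For non-coders curious what that last conversion step involves, here is a minimal sketch. It is not our developer's tool, just an illustration using the chat fine-tuning layout OpenAI documents: one JSON object per line, each with a `messages` list. The function names and the placeholder example text are made up for this sketch.

```python
import json


def to_chat_example(system: str, user: str, assistant: str) -> dict:
    """Wrap one System/User/Assistant triple as a single training example."""
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    }


def to_jsonl(examples: list[tuple[str, str, str]]) -> str:
    """Serialize the triples as JSONL: one JSON object per line."""
    lines = [json.dumps(to_chat_example(s, u, a)) for s, u, a in examples]
    return "\n".join(lines) + "\n"


# Hypothetical placeholder data; a real dataset would hold the 10 cleaned outlines.
examples = [
    (
        "system prompt text",
        "[novel premise] genre: premise text [/novel premise]",
        "[novel outline] outline text [/novel outline]",
    ),
]
jsonl_text = to_jsonl(examples)
```

Because `json.dumps` escapes line breaks inside the text, a multi-paragraph outline still ends up on a single line of the file, which is what JSONL needs.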
The fine-tune file was uploaded yesterday at noon and took 5 hours to validate; then it went into the queue, and 40 minutes later, VOILA, it was done.
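Since validation was the slow part, a quick local check before uploading might save a round trip. The sketch below is not OpenAI's validator. It only checks the things described in this post, and the rules are my assumptions: each line is valid JSON, has a `messages` list, uses only the system, user, and assistant roles, and ends with the assistant reply. The name `check_jsonl` is made up.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}  # the three roles used in this post


def check_jsonl(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # ignore blank lines such as a trailing newline
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not valid JSON")
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {number}: needs a non-empty 'messages' list")
            continue
        well_formed = True
        for message in messages:
            if (not isinstance(message, dict)
                    or message.get("role") not in ALLOWED_ROLES
                    or not isinstance(message.get("content"), str)):
                problems.append(f"line {number}: each message needs a known role and text content")
                well_formed = False
                break
        if well_formed and messages[-1]["role"] != "assistant":
            problems.append(f"line {number}: last message should be the assistant reply")
    return problems
```

Passing the file contents as a string, for example `check_jsonl(open("train.jsonl", encoding="utf-8").read())` with a hypothetical file name, returns the list of problems to fix before uploading.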
I am so excited to work with my authors to create datasets that segment fiction writing in better ways than just random snippets of writing. The FFA has a prompting tool built for members that lets non-coders sequence prompts and write fiction with almost any model out there, in a BYOK (bring your own key) model. We also teach authors how to prompt, how to work with the AI, etc. We are all super excited for the easy UI to train OpenAI models, but we still have to use open-source models too, since many of us also write violence and romance (I mean, who wants a thriller with no death?).
Here is the prompting we used (the AI wrote it, we just designed it):
System:
[persona name]
Persona Name: Outlinemagedon AI
[/persona name]
[core functionality]
Core Functionality: Outlinemagedon AI is designed to excel in creating detailed, engaging, and genre-appropriate outlines for fiction novels. Its expertise encompasses a wide range of genres, ensuring versatility and adaptability to any storytelling requirement.
[/core functionality]
[key attributes]
Key Attributes:
Deep Literary Knowledge: Outlinemagedon AI possesses an extensive understanding of various literary genres, tropes, and narrative structures. It is well-versed in the nuances that differentiate genres, from romance to science fiction, and can tailor outlines to fit specific genre conventions.
Creative Plot Development: The Outlinemagedon AI is skilled in constructing compelling and original plot lines. It can generate ideas for conflicts, twists, and climaxes that keep readers engaged, ensuring that each outline has a clear and satisfying narrative arc.
Rich Character Creation: Outlinemagedon AI excels in developing complex and relatable characters. It can outline characters with distinct personalities, backgrounds, motivations, and growth arcs, contributing to a story’s depth and emotional impact.
Immersive World-Building: The Outlinemagedon AI has the capability to craft detailed settings and worlds, whether for a realistic, contemporary story or a fantastical universe. It understands the importance of setting in storytelling and can integrate world-building seamlessly into the outline.
Collaborative Adaptability: Outlinemagedon AI is designed to collaborate with human writers, capable of taking specific ideas, themes, or elements provided by the user and weaving them into a cohesive and structured outline.
User-Friendly Interface: The Outlinemagedon AI communicates in a clear, concise, and accessible manner, making it easy for writers of all skill levels to understand and use its outline suggestions effectively.
[/key attributes]
[outline style]
Outline Style: Only use new lines or carriage returns to show the outline components in plain text. Avoid using bullets, numbering, or other organization. The outline must be in a simple plain text format. Avoid markdown format.
[/outline style]
USER: Read the following premise for a novel, and be Outlinemagedon AI and write the novel outline:
[novel premise] (put genre: 2-3 sentence story premise here) [/novel premise]
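For the more technical folks, here is a small sketch of why the square-bracket tags are handy: they are easy to fill in and easy to pull back out with code. The helper names (`wrap`, `extract`, `build_user_prompt`) and the sample premise are made up for illustration; only the tag names and the User prompt wording come from the prompt above.

```python
import re

USER_INSTRUCTION = (
    "Read the following premise for a novel, and be Outlinemagedon AI "
    "and write the novel outline:"
)


def wrap(name: str, body: str) -> str:
    """Surround text with a matching pair of square-bracket tags."""
    return f"[{name}] {body} [/{name}]"


def extract(name: str, text: str):
    """Return the text between [name] and [/name], or None if the tags are absent."""
    pattern = re.escape(f"[{name}]") + r"(.*?)" + re.escape(f"[/{name}]")
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def build_user_prompt(genre: str, premise: str) -> str:
    """Fill the User prompt template with a genre and a short premise."""
    return USER_INSTRUCTION + "\n" + wrap("novel premise", f"{genre}: {premise}")


# Hypothetical premise, purely for illustration
prompt = build_user_prompt("Mystery", "A baker finds a body in the flour bin.")
```

The same `extract` helper can pull the outline out of a response that uses the [novel outline] [/novel outline] tags.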
HTH anyone else who is not super technical and wondering whether only 10 examples, each on the longer side, might give results.
Answers: Sample A is the fine-tune, Sample B is 3.5 16k, and Sample C is GPT 4.