Share: Fine-Tuning GPT-3.5 16k with Only 10 Examples (Novel Outlines)

Hi everyone!
I’m an author who has suddenly found herself to be an AI researcher and educator for creative writing. I don’t want authors left behind. :slight_smile:

This is going to be a very beginner-level share of a fine-tuning experience with only 10 examples in the dataset. I’m sharing because when I tried to find answers to some of these questions, I couldn’t find them.

We made a roughly 16,000-token dataset of 10 outlines for novels in 10 different genres. Each example had a system prompt that defined the persona “Outlinemagedon AI” and a simple user prompt to turn a 2-3 sentence novel premise into an outline.

The goal was to make GPT-3.5 16k, which typically writes outlines with very choppy sentences and strict formatting (Roman numeral headers and capital-letter sub-items), instead write in paragraph form with simple formatting, like GPT-4 does.

It was a success! We were stunned! Settings were temperature 0.7, top_p 0.7, and presence penalty 1. The answer key for which sample is which is at the bottom of the post, in case you want to blind-evaluate first:

  • We only had 10 samples and still really moved the needle; each sample was 1,200-1,600 tokens long across the system, user, and assistant messages.
  • The Future Fiction Academy (a school I run for authors) all agreed that Sample A was superior in both length and specificity.
  • We are all excited to push the limits further with small datasets that help GPT-3.5 16k Turbo write longer and better fiction prose from a variety of prompting styles, including longer “mega prompting” or context stuffing on the first pass, so we don’t have to run so many chains of “write it longer.”
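For anyone following along in code, the sampling settings mentioned above translate into request parameters like this. This is a minimal sketch: the model ID and message contents are placeholders, not our real fine-tune ID or prompts.

```python
import json

# Sketch of a chat-completions request using the settings from this post.
# The model ID below is a PLACEHOLDER for your own fine-tune's ID, and the
# message contents stand in for the real prompts.
request = {
    "model": "ft:gpt-3.5-turbo:your-org::placeholder",
    "temperature": 0.7,
    "top_p": 0.7,
    "presence_penalty": 1,
    "messages": [
        {"role": "system", "content": "...Outlinemagedon AI system prompt..."},
        {"role": "user", "content": "Read the following premise for a novel..."},
    ],
}

# With the official Python SDK, this dict would be passed as keyword
# arguments, e.g. client.chat.completions.create(**request).
print(json.dumps(request, indent=2))
```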

Our methodology was:

  • Use system and user prompts to generate the 10 outlines with GPT-3.5 16k. We used these as a baseline for comparison.
  • We identified the parts we did not like (choppy sentences, funky formatting) and prompted GPT-4 to rewrite each outline in paragraphs with cleaned-up formatting.
  • We then cleaned up any GPT-4 outlines that still had bizarre formatting, like a character list we didn’t ask for, and wrapped everything in tags: [novel outline] [/novel outline]. We like to use square brackets in our long prompts so it’s easier to refer back to those elements for a writing task.
  • Then we used a tool our developer built for us to take the system prompt, user prompt, and assistant response (the cleaned-up GPT-4 outlines) and convert them into JSONL format. This is not easy for non-coders to do, even if they’re experts in the field the dataset covers.
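For anyone curious what that conversion produces: OpenAI’s chat fine-tuning format is one JSON object per line, each with a `messages` array. A minimal sketch of what a tool like our developer’s does; the prompt strings here are placeholders, not our real training data:

```python
import json

# Each training example is a (system prompt, user prompt, assistant response)
# triple. The strings below are placeholders, not the real prompts/outlines.
examples = [
    ("You are Outlinemagedon AI...",
     "Read the following premise for a novel... [novel premise] ... [/novel premise]",
     "[novel outline] ...paragraph-form outline... [/novel outline]"),
]

def to_jsonl(triples):
    """Convert (system, user, assistant) triples into chat fine-tuning JSONL."""
    lines = []
    for system, user, assistant in triples:
        record = {"messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]}
        lines.append(json.dumps(record))
    return "\n".join(lines)

jsonl_text = to_jsonl(examples)
```

Each line of the resulting file is one complete training conversation, which is the shape the fine-tuning upload endpoint expects.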

The fine-tune was uploaded yesterday at noon; it took 5 hours to validate the file, then it went into the queue, and 40 minutes later, VOILA, it was done.

I am so excited to work with my authors to create datasets that segment fiction writing in better ways than just random snippets of prose. The FFA has a prompting tool built for members that lets non-coders sequence prompts and write fiction with almost any model out there, in a BYOK (bring your own key) model. We also teach authors how to prompt, how to work with the AI, etc. We are all super excited about the easy UI for training OpenAI models, but we still have to use open-source models as well, since many of us also write violence and romance (I mean, who wants a thriller with no death?).

Here is the prompting we used (the AI wrote it, we just designed it):

[persona name]
Persona Name: Outlinemagedon AI
[/persona name]
[core functionality]

Core Functionality: Outlinemagedon AI is designed to excel in creating detailed, engaging, and genre-appropriate outlines for fiction novels. Its expertise encompasses a wide range of genres, ensuring versatility and adaptability to any storytelling requirement.
[/core functionality]
[key attributes]

Key Attributes:

Deep Literary Knowledge: Outlinemagedon AI possesses an extensive understanding of various literary genres, tropes, and narrative structures. It is well-versed in the nuances that differentiate genres, from romance to science fiction, and can tailor outlines to fit specific genre conventions.

Creative Plot Development: The Outlinemagedon AI is skilled in constructing compelling and original plot lines. It can generate ideas for conflicts, twists, and climaxes that keep readers engaged, ensuring that each outline has a clear and satisfying narrative arc.

Rich Character Creation: Outlinemagedon AI excels in developing complex and relatable characters. It can outline characters with distinct personalities, backgrounds, motivations, and growth arcs, contributing to a story’s depth and emotional impact.

Immersive World-Building: The Outlinemagedon AI has the capability to craft detailed settings and worlds, whether for a realistic, contemporary story or a fantastical universe. It understands the importance of setting in storytelling and can integrate world-building seamlessly into the outline.

Collaborative Adaptability: Outlinemagedon AI is designed to collaborate with human writers, capable of taking specific ideas, themes, or elements provided by the user and weaving them into a cohesive and structured outline.

User-Friendly Interface: The Outlinemagedon AI communicates in a clear, concise, and accessible manner, making it easy for writers of all skill levels to understand and use its outline suggestions effectively.

[/key attributes]

[outline style]

Outline Style: Only use new lines or carriage returns to show the outline components in plain text. Avoid using bullets, numbering, or other organization. The outline must be in a simple plain text format. Avoid markdown format.
[/outline style]

USER: Read the following premise for a novel, and be Outlinemagedon AI and write the novel outline:
[novel premise] (put genre: 2-3 sentence story premise here) [/novel premise]

HTH anyone else who’s not super technical and is wondering whether longer examples, and only 10 of them, might give results.

Answers: Sample A is the fine-tune, Sample B is GPT-3.5 16k, and Sample C is GPT-4.


Hey, Elizabeth. Good to see another writer here!

Great post. Glad to see fine-tuning is working out for you.

Hope you stick around. We’ve got a great community growing here.


Thanks @PaulBellow! Steph Pajonas is another founder of FFA, and she started the AI Writing for Authors group on Facebook. I admin that with her, and there are over 5,000 authors in there, if you’re also on Facebook.

It’s not been easy being “out” as an AI author since 2021… but, slowly people are coming around and realizing “Hey, we need to know how to do this or it will be too late.”


Yeah, Steph is great! I unfortunately don’t get to spend too much time in her group, though.

And I hear you on the “anti-AI” crowd. My first question is always, do you use a computer or handwrite your copy? Small smile.


Or my new favorite:

“It stole, so I will NEVER use it for writing/art… oh but marketing is fine.”

It. was. the. same. training. ::head_desk::


Hey, @spajonas, good to see you here!



Welcome to the community!

I am interested in knowing your opinion on using AI art or ChatGPT-generated content in books.


It’s good to have feedback about fine-tuning. I’ve been pessimistic that the bare minimum of 10 examples, presumably set just to head off completely futile efforts, could have any useful effect on inference quality, especially when you compare that to community fine-tuning of Llama 2 running more along the lines of 500k examples, and gpt-3.5-turbo surely into the millions.

One part of the promise of fine-tuning is the AI following the examples without all the system prompting that does the same job: just giving your model the identity as a starting point should get it producing outputs like the examples. Including it all in the system context is, of course, that much more reinforcement.

Getting the AI to write fluidly at length and with attention to plot development is certainly something it doesn’t do out of the box.


Good job on the fine-tune, and welcome to the community!

I think you’ve managed to improve on the standard response on GPT-3.5 and I do agree that sample A is the better one. :laughing:

I’ve previously posted some tutorials here on the forum about fine-tuning GPT to respond in your own writing style and generate prompts, you might find these interesting.

Thanks for sharing! Always feel free to ask if there’s a question you can’t find the answer to.


It doesn’t bother me. I’ve been publishing with AI-generated text in my books since December 2021, starting with A Test of Fire. I do believe all AI content must be validated by a human, and a human should be responsible for the content.


Thank you for the links. I don’t code, actually. So unfortunately, while your tutorials are very thorough… it’s not something I can follow and do. And that’s totally okay! I’m really grateful for all of the GUI and things that OpenAI is bringing into the Playground area.


No worries, my friend,
there are tons of developers on this forum who are willing to help! I’m especially thinking of young people who are looking for opportunities to get practical experience with AI, if that’s something you’d be interested in?


Really wild to see how the fine-tune brings it up to GPT-4 level!


Interesting project! Nice to see another writer on here. I agree that I don’t want to see writers left behind and I love your methodical experimentation with AI as a writing partner.

I see ChatGPT as the best thing to ever happen to writers, and the ultimate tool for collaboration and creativity.


That’s a very generous offer. If I had more hours in a day . . . I might take you up on it. Right now I have my hands full running the FFA, writing, and working on prompting/fine-tuning. But I appreciate the generosity. Truly.


Please let us know how your further tests with fine-tuning for outlines goes!

Will do! Today we got the ability to use our fine-tunes inside Rexy. I’m at a Barnes & Noble designing a dataset right now to make GPT-3.5 write at least 1,000 words consistently. I’m going to stick with just one genre at first, probably my orc mafia romance project I was writing live in October.

If I can get GPT-3.5 to consistently write 1,000+ words to a writing brief while keeping narrative logic? That will be a huge win. It typically writes between 500-800 words. OpenAI’s models usually stick to the outline; it’s getting them to do that AND be creative that’s tricky.

I’m hoping to have a dataset ready to run tomorrow and go live at 7pm EST on YouTube.

I will stick with trying 10 samples again, as crazy as that sounds. But hey, if that works, why do more?


Yeah, exact word counts are hard. I’ve found that if you count by sentences or paragraphs, it usually does better. I’ve also noticed that when it has a lot to cover, it tends to shorten things to make sure it all fits in the context window. So, doing 500 to 1,000 word bursts is usually better.

If it ain’t broke, as they say! Haha.

Caught your great YouTube video: It convinced me to soon add the ability to consume your FineTuned models on


Update for those who read the original:

We have moved into making the fine tunes write fiction.

We are learning that training loss is a meaningless metric for the fiction-writing use case. We are getting great results from training-loss numbers in the 1.8+ range and all going, “Um… what does any of this even mean?” LOL.

We are using 10-24 writing examples and finding that a fine-tune on GPT-3.5 16k can:

  • write the full 4,096-token output length very easily (we get 2,000-3,000 word responses from GPT-3.5!)
  • make the hyperparameters (temperature, top_p, frequency and presence penalty) VERY fragile. A slight bump in frequency or presence penalty and you get gibberish. Lower temperature works best too, around 0.3-0.7, on most fine-tunes.
  • We are also using GPT-4 Turbo to analyze our human writing to understand which keywords the AI associates with it (as I write this, I realize we should probably start using 16k for this, since the goal is to meet the model halfway, where it already works).
  • Fine-tuned 16k models can write unattributed dialogue (one of our benchmarks: a line of dialogue with no “Jane said” or “exclaimed,” just “What do you want to eat?”).
  • Fine-tuned 16k models can write hilarious character interactions (I’m starting the YT video at the 42-minute mark, where you can see us compare the regular model to the fine-tune, see how fragile the settings can be, and, you know, juggle lobsters :slight_smile: ).
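Because the settings turned out to be so fragile, it can save time to probe a small grid of candidate settings against one fixed test prompt rather than tweaking by hand. A sketch of that idea; the value ranges reflect the observations above, and the resulting dicts would be merged into a real request:

```python
from itertools import product

# Candidate sampling settings to probe a fine-tune's stable range.
# Outputs degraded fast outside these values in our experience,
# so the grid stays conservative.
temperatures = [0.3, 0.5, 0.7]
penalties = [0.0, 0.2]  # applied to both frequency and presence penalty

def setting_grid():
    """Yield one request-parameter dict per settings combination."""
    for temp, pen in product(temperatures, penalties):
        yield {
            "temperature": temp,
            "top_p": 0.7,
            "frequency_penalty": pen,
            "presence_penalty": pen,
        }

grid = list(setting_grid())
# Merge each dict into a chat request against a fixed test prompt,
# then judge the outputs by eye for coherence vs. gibberish.
```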

We are really jazzed with some of the results, and we are able to make fine-tunes that follow scene briefs and stick to an author’s voice and style. Many authors now have fine-tunes writing responses that we can’t tell apart from human writing.