Finetuning GPT-3 to write a novel... part 1 (aka AUTOMUSE RIDES AGAIN)



Looks familiar! Hope it fares well!


Just watched the video. The problem at the end of the video, where the returned content gets too large, is hard to get around. I ran a summarizer over the previous chunks to get them down to about 200 tokens, but that still limited me to about ten chapters, so I then had to summarize those ten chapters down to a few sentences to move on.

I decided to do this because I don’t think we really read books holding that much context from chapter one, unless it’s a major event or a character; we tend to read chapter by chapter, following the current chapter and recalling the events that came before.

It turned out pretty well with that system.
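
For anyone who wants to try the same thing, here is a minimal sketch of that two-level summarizing in Python. It is a guess at the shape of the system, not the actual code: `summarize` is a stand-in for whatever summarizer you use (a model call, for instance), and the names `RollingSummary`, `add_chunk`, `context` and the default of ten summaries are assumptions made for illustration.

```python
from typing import Callable, List


class RollingSummary:
    """Keep short summaries of recent chunks; fold them into one when full."""

    def __init__(self, summarize: Callable[[str], str], max_items: int = 10):
        self.summarize = summarize    # hypothetical: any text -> shorter text
        self.max_items = max_items    # assumed limit, roughly "ten chapters"
        self.digest = ""              # few-sentence summary of older material
        self.recent: List[str] = []   # per-chunk summaries not yet folded

    def add_chunk(self, chunk_text: str) -> None:
        # Summarize the new chunk on its own (the "about 200 tokens" step).
        self.recent.append(self.summarize(chunk_text))
        if len(self.recent) >= self.max_items:
            # Fold the old digest and the recent summaries into a new digest.
            combined = "\n".join([self.digest] + self.recent).strip()
            self.digest = self.summarize(combined)
            self.recent = []

    def context(self) -> str:
        """Text to place under 'Story so far' in the next prompt."""
        return "\n".join(part for part in [self.digest] + self.recent if part)
```

The trade-off is the one described above: detail from early chapters gets squeezed hardest, on the assumption that only major events and characters need to survive.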


Awesome! I had some success by writing chunks at a time, for example one scene in an act at a time (with several options to choose from, one is usually quite good)… as stated above. No luck with fine-tuning, at least with the format of “story” data examples we used.

Best luck and regards!


Part 2 - a hot mess. This turned out to be much more work than I thought due to constraints of window sizing, and the fact that The Adventures of Sherlock is not a single story ._.


This is great.
I’ve made two small changes: inserting the summary for each chunk into the final prompt before the chunk, and then the summary for the next chunk to be written after it.

This gives it a better cue of what to do and where to stop when writing.

So now it goes:

Story so far:
Last chunk summary:
Last chunk text:
Next chunk summary:
Then finally
Next chunk text:
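
In case it helps, the layout above could be assembled with something like the following Python sketch. The function name `build_chunk_prompt` and its parameters are made up for illustration; only the five labels come from the list above.

```python
def build_chunk_prompt(story_so_far: str, last_summary: str,
                       last_text: str, next_summary: str) -> str:
    """Assemble the five-part prompt; the model continues after the last label."""
    sections = [
        ("Story so far:", story_so_far),
        ("Last chunk summary:", last_summary),
        ("Last chunk text:", last_text),
        ("Next chunk summary:", next_summary),
    ]
    body = "\n\n".join(f"{label}\n{text.strip()}" for label, text in sections)
    # End on the open label so the completion is the next chunk's text.
    return body + "\n\nNext chunk text:\n"
```

Ending the prompt on the open "Next chunk text:" label is what lets the preceding summary act as the cue for what to write and where to stop.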

It might be worth reconsidering all your book choices…
Sherlock is an anthology, and Alice in Wonderland is all over the place plot-wise… it’s not a well-constructed story. Pride and Prejudice is too long, and I wouldn’t go for Dracula either, as it’s a collection of letters.

Maybe go for something with a clear story for training?


Yeah, this is just proof of concept. For the final set it will probably be at least 10 or 20 different books. Writing a novel is one of the most complex things a single person can do, so I have no expectation that this will even work.


I’m not really after creating a fully automated system anyway, so everything is just adding to the writer’s toolkit.

From that point of view it’s all useful.

Currently looking at ways to develop characters, themes, subplots and story plans, and use them to get from a premise to a scene or chapter list.


That sounds highly doable. Have you had success? Care to share some insights?

Warning: long post… Here are my ideas right now, working with the Playground:

Ok, I’ve been working on prompts that will create more detailed premises with more emotional depth, and with complete story arcs.

I’ve been working on prompts which begin with characters who have depth and detail. Here’s an initial one that works (feel free to adapt and adopt). Of course, swapping the words “quirky” and “literary” for other forms will change the style, e.g. “hardboiled” and “detective” could also be used, or “alien” and “sci-fi”:

Brainstorm a quirky or unusual character who could be the basis of a literary novel.

  1. Name
  2. a two sentence description of the character’s personality, internal contradictions and inner conflict
  3. a paragraph describing what the character looks and sounds like, written in the style and voice of the novel
  4. a history of the character’s life up to now
  5. archetype
  6. quirks and flaws
  7. a description of the character using the big 5 personality traits psychological assessment method


Once we’ve got the character, we can do another prompt to create a story around them (again, switch up the adjectives for other genres):

Brainstorm a detailed synopsis for a literary novel, including the final climax of the story and all spoilers. Use the above character as the main protagonist. Use 500 words, about 6 paragraphs:

From here, let’s develop a supporting cast:

Other characters:
Imagine the other characters in the story, their names, descriptions of their looks, personalities and desires. Identify the lead antagonist, love interest, mentor, comic relief, and supporting characters. Use 100 words for each character.

Now, consolidating the prompt to include all the above, let’s give it some subplots:

Imagine the subplots that this story might have aside from the main plot:
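
If you would rather script this than paste into the Playground, the chain so far could look roughly like the Python sketch below. It assumes a `complete(prompt)` function that you supply yourself to call the model; that function, `run_chain`, and the trimmed instruction strings are placeholders for illustration, and the instructions are shortened from the prompts above.

```python
from typing import Callable, Sequence

# Instructions shortened from the prompts in this post.
STEPS: Sequence[str] = (
    "Brainstorm a quirky or unusual character who could be the basis of a literary novel.",
    "Brainstorm a detailed synopsis for a literary novel, including the final climax "
    "of the story and all spoilers. Use the above character as the main protagonist.",
    "Imagine the other characters in the story, their names, descriptions of their "
    "looks, personalities and desires.",
    "Imagine the subplots that this story might have aside from the main plot:",
)


def run_chain(complete: Callable[[str], str], steps: Sequence[str] = STEPS) -> str:
    """Feed each instruction, plus everything generated so far, back to the model."""
    document = ""
    for step in steps:
        # The running document comes first, so "the above character" resolves.
        prompt = (document + "\n\n" + step).strip() + "\n"
        answer = complete(prompt)  # hypothetical model call supplied by the caller
        document = prompt + answer.strip()
    return document
```

Because the whole document is resent at every step, the prompt grows quickly, which is the length problem that comes up further down.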

Now we’ve got some detailed story ideas, it’s time to hammer them into a format… I’ve suggested three ways to do that below. They may need refining. Pick the one you like.

brainstorm a “hero’s journey” breakdown for the above story:

The ordinary world:

Brainstorm a “save the cat” beat sheet for the above story:
Opening image:

Brainstorm a three act structure for the above story:
Act i
Scene 1:

Once you’ve got the breakdown, it’s time to get detailed:

Break this novel down into 20 chapters. Brainstorm chapter synopses for each chapter:
Chapter 1:

Next, we can use the “insert” tool to break each chapter down into scenes:

brainstorm the scenes in chapter 2. There are 3 scenes, each with a beginning, a middle and an end:
Scene 1:

Next we have to break the scenes down into chunks - as seen in your video.
To do this, we’ll need to:
1. Shorten the prompt. It’s getting out of control.
2. Increase the level of detail and colour. Chunks often have little to do with the main thrust of the story and concern themselves with granular details. For example, it’s fine to spend a chunk talking about the barista in a coffee shop the protagonist visits, simply because that character illustrates the way the protagonist is feeling at that particular moment. Equally, you might have a chunk dealing with the make and model of a particular gun to illustrate the fact that the protagonist is a military expert.

I’m not sure how we’re going to identify where and when those diversions are appropriate, and how to bring them in.
My instinct is that human tweaking will be needed here.
Of course, training with a wide range of books might just surprisingly solve the problem. I’m not certain though!

What do you think?


Part 3 is up and… it actually works better than I expected

Share your results when you’ve got something!

looking good as always.

On my side of things, I’m coming at it from the other direction - starting with developing a sound story, and trying to work that down into a series of prompts for individual scenes and paragraphs.

Trying out the character-based method I posted above, I’ve found it’s certainly possible to get down to a scene breakdown. I’ve been doing lots of editing of the results to steer the story, and from a writer’s point of view, it’s gratifying that this works well to improve the story.

Your system should be pretty robust with what gets thrown into it as far as background, story so far, etc. is concerned, so I think putting the two together will work in the end.

I’ve also experimented with getting GPT-3 to write a film script as an intermediate stage.

A screenplay is a very structured and concise way to tell a story, so it gets the scaffolding of a scene done quickly.

The idea is that you can then take the screenplay and get it to novelise from there, adding description, and literary work.


When it comes to creating actual passages of text from prompts (I’m using vanilla text-davinci-002, so no training), my current thinking is that the prompt needs:

1. Characters appearing in a scene:

2. The story so far:

3. Summary of last paragraph:

4. Text of last paragraph:

5. Summary of next paragraph:

And it will produce:

6. The text of next paragraph:

This produces a couple of problems:

Problem: length decay…

You’ll need to feed it a good strong paragraph 1 to give it a good idea of what length to use for the next one.

However, gpt3 is lazy, and responses will tend to get shorter and shorter until we’re just left with a response that’s similar to the summary.

One solution is to specify a length… and cheat.

If your prompt says

Text of last paragraph (use at least 100 words)

But then for the paragraph to be generated, say:

Text of next paragraph (use at least 500 words):

Then gpt3 will know it has to raise its game for the next paragraph.
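
As a rough Python sketch of that cheat: the helper below builds the tail end of the prompt with a low word target on the previous paragraph and a higher one on the paragraph to be written. The function name `length_cued_tail` and its parameter names are invented for illustration; the 100 and 500 defaults are the figures from this post.

```python
def length_cued_tail(last_paragraph: str, next_summary: str,
                     last_min_words: int = 100, next_min_words: int = 500) -> str:
    """Label the previous paragraph with a low word target and the next with a higher one."""
    return (
        f"Text of last paragraph (use at least {last_min_words} words):\n"
        f"{last_paragraph.strip()}\n\n"
        f"Summary of next paragraph:\n{next_summary.strip()}\n\n"
        f"Text of next paragraph (use at least {next_min_words} words):\n"
    )
```

The label on the last paragraph is the cheat: it claims a target the existing text may not have been written to, purely so the next target reads as a step up.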

Problem: voice decay

In the same way, the voice and style of the prose also decays.

If you start off your first paragraph with a very strong character voice, there will likely be less and less of that voice as the writing continues.


I’m tempted to say we have a “tone of voice” prompt in it every time, including a section of prose, but that seems like it’d be expensive.

Another alternative would be to fine-tune with only books in similar styles. Again, I don’t know how well it’d work.

A third method would be to write every (say) 10th paragraph manually to keep the style in check.

As a shorthand, I’m trying little cues in the “text of new paragraph” prompt. For example:

(writing style: hard boiled fast moving descriptions with active language)

(writing style: lots of metaphors and descriptions of people, places and feelings)

(writing style: highly detailed descriptive)

This doesn’t work very well, but does have some effect, I think.

Problem: detail of prompts

The “summary of next paragraph” needs to be incredibly tightly written.

It should never use “him, her, they” but always include names.

It should be a very clear set of instructions - far more succinct and well crafted than we’ve been generating automatically up to now. The kind of waffly language that we tend to get from expanding plot summaries won’t cut it.
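
The "names, not pronouns" rule is easy to check mechanically before a summary goes into a prompt. Here is a small Python sketch; the pronoun list and the name `find_pronouns` are my own assumptions, so extend the list to suit your summaries.

```python
import re
from typing import List

# Assumed list of pronouns to flag; not exhaustive.
PRONOUNS = ("he", "him", "his", "she", "her", "hers",
            "they", "them", "their", "theirs")
_PRONOUN_RE = re.compile(r"\b(" + "|".join(PRONOUNS) + r")\b", re.IGNORECASE)


def find_pronouns(summary: str) -> List[str]:
    """Return the pronouns found, in order, so they can be replaced with names."""
    return [match.group(0) for match in _PRONOUN_RE.finditer(summary)]
```

For example, `find_pronouns("Anna gives him the key, then they leave.")` flags "him" and "they", while a summary that uses only names returns an empty list.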

In creating prose, Gpt3 tends to:

  1. Introduce unwanted deviations from the plot
  2. Not bring in enough extra colour (like character descriptions, atmosphere, etc.)
  3. Run on past the prompt into the next paragraph, scene, or even to the end of the book.

It needs to be both more expansive, and more constrained.

Doing a four-stage rewrite helps with (1) and (3). Reduce the temperature to about 0.3 and put in the prompt:

Text of last chunk


Summary of next chunk

Then run the following question (sorry about making this a 4 step re-write, but I can’t find phrasing that works any other way):

Which statements in “text of last chunk” contradict those in “summary of next chunk”?

Once you have the results of this, add them to the prompt and run:

rewrite “text of last chunk” resolving these inconsistencies:

Next do:

Which actions in “text of last chunk” are repeated in “summary of next chunk”?

And then:

Edit “text of last chunk”. delete repeated actions:
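
Strung together, the four steps might look like this in Python. This is only a sketch under assumptions: `complete(prompt, temperature)` stands in for your own model call, and `four_step_rewrite` and the inner helper `ask` are names invented here. The question and instruction wording is taken from the steps above.

```python
from typing import Callable

# complete(prompt, temperature) -> text is a stand-in for your model call.
Complete = Callable[[str, float], str]


def four_step_rewrite(complete: Complete, last_chunk: str, next_summary: str,
                      temperature: float = 0.3) -> str:
    """Resolve contradictions, then delete repeated actions, in 'text of last chunk'."""

    def ask(chunk: str, question: str, instruction: str) -> str:
        base = (f"Text of last chunk:\n{chunk}\n\n"
                f"Summary of next chunk:\n{next_summary}\n\n")
        findings = complete(base + question + "\n", temperature)
        # Add the findings to the prompt, then ask for the rewrite.
        rewrite_prompt = base + question + "\n" + findings + "\n\n" + instruction + "\n"
        return complete(rewrite_prompt, temperature).strip()

    chunk = ask(
        last_chunk,
        'Which statements in "text of last chunk" contradict those in "summary of next chunk"?',
        'Rewrite "text of last chunk" resolving these inconsistencies:',
    )
    return ask(
        chunk,
        'Which actions in "text of last chunk" are repeated in "summary of next chunk"?',
        'Edit "text of last chunk". Delete repeated actions:',
    )
```

That is four model calls per chunk, so it is worth checking the cost before running it across a whole book.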

(2) is trickier. It’s a very subtle part of the writer’s art to decide on and place details. When do you want to describe a character, and when do you want to leave them faceless? When is a good time to stop the plot and concentrate on the atmosphere? What kind of observation on the human condition would make sense for this character to arrive at in this moment? How should you trail a small detail that will come up later in the story?

It’s possible there are some formulaic answers to some of these questions, but it’s complicated, and has to do with overall rhythm and style. It’s also where highly original comparisons and interactions come together. Maybe this is the bit you need to be an intelligent, conscious entity to think about….

More than one way to skin a cat…

I think we’re going to need to head in both directions.

We’ll need a robust strategy for developing the complex background material and structures and style cues that a novel needs in order to be coherent, but we’ll also need a good quality trained model for generating the actual text.

And it’s done!!

Note: still some bugs to work out, but it works in principle. I might come back to this, but y’all are welcome to take it and run with it. See if you can fix the repetition bug.


Nice work, Dave. I appreciate you showing the full process, and not editing out the challenges. The troubleshooting and associated thought process is quite useful.


Fascinating work.

I’m going to post some of my own work soon on GitHub and YouTube.

I think the solution to your repetition problem is going to be human intervention and massive training sets…


Okay I’ve come back to this project with AutoMuse3 with some new ideas!

  1. Rather than a finetuning project, this is more of a cognitive architecture project (prompt engineering).
  2. I use lessons from my “simulation microservice” to run a text-based environment simulation
  3. I use lessons from cognitive architecture to run a very lightweight character model for each character
  4. It runs recursively, summarizing the whole story, adding a scene, character, and plot events
  5. After simulating the setting/character/plot it tries to convert those logs into entertaining prose

It’s pretty rough right now. It tends to really go off the rails. One time, it blew up the coffee shop, killing all the characters. Another time, one of the characters left to go to work and it got stuck in an infinite loop having everyone saying goodbye. So there’s still work to do! But I think I’m onto something here. Just a little bit more sophistication and dynamic control, and it will be able to simulate several characters for an arbitrary length of time.
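
To make the shape of the loop concrete, here is a very rough Python sketch. To be clear, this is not the AutoMuse3 code, just an illustration of the five points above under my own assumptions: `complete(prompt)` is a placeholder for a model call, the prompt wording is invented, and `max_steps` is a hypothetical hard cap added as one simple guard against the endless-goodbyes kind of loop.

```python
from typing import Callable, List


def run_story(complete: Callable[[str], str], characters: List[str],
              premise: str, max_steps: int = 20) -> List[str]:
    """Alternate simulation and prose steps, with a hard cap on iterations."""
    summary = premise
    prose: List[str] = []
    for _ in range(max_steps):  # the cap stops a runaway simulation
        # Simulate the setting, then let each character react in turn.
        log = complete(f"Story so far:\n{summary}\n\n"
                       "Describe what happens next in the setting:\n")
        for name in characters:
            log += "\n" + complete(f"Events:\n{log}\n\n"
                                   f"What does {name} do or say next?\n")
        # Convert the simulation log into prose.
        prose.append(complete(f"Simulation log:\n{log}\n\n"
                              "Rewrite this log as entertaining prose:\n"))
        # Recursively summarize so the context stays a manageable size.
        summary = complete(f"Summarize the whole story so far:\n{summary}\n{log}\n")
    return prose
```

A cap only bounds the damage; it does not stop the coffee shop from blowing up, which would need some check on each log before it is accepted.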

The prose-writing part is probably going to require a lot of prompt engineering or finetuning. Well, all of this could benefit from finetuning, but it’s off to a great start. Here’s the code:

And here’s a companion video explaining it and demonstrating it:
