How do I force a truncated completion to continue?

I’m experimenting with generating LONG texts

  • using a GPT-3 model, via the OpenAI completion create API / the web Playground
  • using ChatGPT

Suppose I design a prompt asking the model to generate a book or a screenplay.
It’s a commonplace that GPT systems can generate books, movie screenplays, or any document hundreds of pages long.
But in fact I don’t understand how it’s possible to get GPT-3 to ‘continue’ a completion from a given seed (a synopsis).

For example, I tried to produce a screenplay with the GPT-3 Playground, but the generated output breaks off (incomplete) at a certain point.

I’m confused. One reason could be that the max token limit (4K tokens) has been reached, but sometimes the output breaks off even when the limit has not been reached.
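For what it’s worth, the completion response does tell you *why* generation stopped: each choice carries a `finish_reason` field, which is `"length"` when the `max_tokens` ceiling was hit and `"stop"` when the model reached a stop sequence or a natural end. A minimal check, written against the shape of the JSON response (the helper name is mine):

```python
def was_truncated(response: dict) -> bool:
    # "length" = cut off by the max_tokens limit, possibly mid-sentence;
    # "stop"   = the model hit a stop sequence or a natural end
    return response["choices"][0]["finish_reason"] == "length"

# Example response fragment, shaped like the completions endpoint's JSON:
resp = {"choices": [{"text": "FADE IN: ...", "finish_reason": "length"}]}
print(was_truncated(resp))  # True → the output was truncated; ask for a continuation
```

So at least you can distinguish “the model was cut off” from “the model decided it was done”.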

Anyway, is there a procedure to let a GPT-3 model continue generating output following the previous context? Reading the API specification, I didn’t find any info about this.

Please don’t tell me I have to shift/slice the prompt window, eliminating old content from the prompt and cycling until I reach the real end of the generated content.

Is this the only solution?
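As far as I know, that loop is the usual pattern, and it can be sketched in a few lines. The sketch below is an illustration only: `complete` stands in for a call to the completions endpoint (e.g. a wrapper around `openai.Completion.create`), and the character-based window is a crude substitute for real token counting:

```python
def generate_long(complete, seed: str, max_steps: int = 10, window: int = 3000) -> str:
    """Repeatedly ask the model to continue, keeping only the tail of the
    accumulated text as context (the shifting/slicing prompt window).

    `complete(prompt)` must return (generated_text, finish_reason); in
    practice it would wrap a call to the GPT-3 completions endpoint.
    """
    story = seed
    for _ in range(max_steps):
        prompt = story[-window:]          # keep only the most recent context
        text, finish_reason = complete(prompt)
        story += text
        if finish_reason == "stop":       # the model reached a natural end
            break
    return story
```

The model has no memory between calls, so feeding the tail of its own output back in as the new prompt is, to my knowledge, the only way to get a continuation that stays coherent with what came before.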

BTW, using ChatGPT instead, it seems to me that it is ‘partially’ able to continue the generation just by saying ‘continue’, but it still happens that sentences are truncated/abandoned.

Any Idea?


So I can’t speak to a way to continue prompts past the token limit; I believe it’s an open problem with large language models (though don’t take my word for it).

I can speak to some experiments I ran with ChatGPT that might shed some light on how it partially extends its token limit, though. I started a session with the question, “How big is the sun?” and then, after it answered, I pasted a long research article into the prompt and asked it to summarize that article for me. It did so successfully, which led me to believe that it read the entire thing (token limit be damned). This impression was reinforced when I asked it, “What was the first question I asked you?” and it replied with my question about the size of the sun. However, when I asked it, “What was the last question I asked you after the research article?” it had no clue. So I’m inclined to think it just takes values from the first ~4,000 tokens of any given chat and then works with that plus the last question for its responses, maybe adding some shifting/slicing/etc., but I suspect not.
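A cheap way to sanity-check that “~4,000 tokens” hypothesis is to estimate the token count of whatever you paste before pasting it. The rule of thumb below (~4 characters per English token) is only an approximation; for exact counts you’d use the model’s actual tokenizer (OpenAI publishes one as the `tiktoken` library):

```python
def rough_token_count(text: str) -> int:
    # Rule of thumb for English text: roughly 4 characters per token.
    # An exact count requires the model's real tokenizer (e.g. tiktoken).
    return len(text) // 4

article = "word " * 4000  # stand-in for a pasted research article
if rough_token_count(article) > 4000:
    print("This alone already overflows a 4K-token context window.")
```

If the article alone blows past the window, then whatever ChatGPT summarized, it can’t have been holding the whole thing in context at once.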