ChatGPT (GPT-3) Producing Incomplete Summaries and Stopping Early in Text Processing

Hi everyone,

I’ve been using ChatGPT to generate summaries of long texts, but I’ve noticed that the summaries are often incomplete.

Upon further investigation, I discovered that ChatGPT is not processing the entire text, but rather stopping early in the document. To verify this, I asked it to provide the first and last sentences of the text and to count the total number of words.

The results were surprising: the last sentence it provided was actually from the beginning of the text, and the word count was significantly lower than the actual number of words in the document.

Does anyone have any ideas on how to resolve this issue? Is this a common problem with ChatGPT (GPT-3), or a unique occurrence?

Thanks in advance for your insights and assistance.

1 Like

I’m having the exact same problem. Sometimes it works if I just ask it to rewrite, but right now it keeps stopping in the middle of a sentence over and over when working with a relatively long text.

2 Likes

The max_tokens setting may be your problem.

But it is more likely that you are asking a question of a model that can handle 4096 tokens while your prompt is quite long. When the AI tries to respond, it can only use the tokens you have left over.

So if you ask a question and provide text that the answer should be based on, and that prompt consumes 3000 tokens (let’s say), then the AI has to fit its entire response in the remaining 1096 tokens.

If you take this to the extreme and provide 4000 tokens in your prompt, the AI has only 96 tokens left.

Of course, this is a lot worse if you are using one of the models with a 2048-token limit.

Edit: ChatGPT (not GPT-3) appears to be based on a model with a 4096-token limit (derived from this post)
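If you want to check this yourself, here is a minimal sketch using the tiktoken library; the model name and the 4096-token window are assumptions based on the discussion above, not something confirmed in this thread.

```python
# Rough sketch: estimate how many tokens your prompt consumes and how many
# are left over for the reply, assuming a 4096-token context window.
import tiktoken

CONTEXT_WINDOW = 4096  # assumed limit for the ChatGPT model discussed here

def remaining_budget(prompt: str, model: str = "gpt-3.5-turbo") -> int:
    enc = tiktoken.encoding_for_model(model)
    prompt_tokens = len(enc.encode(prompt))
    return CONTEXT_WINDOW - prompt_tokens

print(remaining_budget("Summarize the following text: ..."))
```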

2 Likes

Hello, thank you for your very clear response. However, several questions come to mind:

  1. Does this mean that ChatGPT is unable to summarize a text of more than 4096 words? (Will excess words simply not be taken into account?)
  2. What is the best strategy for summarizing a long text (e.g., 20,000 words) using ChatGPT?
  3. Why is ChatGPT limited to 4096 tokens? Isn’t GPT-3 much more capable?

Thank you for your help,

  1. Correct, at this time 4096 tokens is the limit for the ChatGPT model.
  2. The usual recommendation I see is summarizing in chunks, which isn’t the best way… (see the sketch below).
  3. It’s a matter of compute / processing power and serving the model to millions of people on a daily basis. I’m confident that over time, the token window will be expanded…

Hope this helps!
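For question 2, here is a minimal sketch of the chunk-then-summarize approach mentioned above, using the openai and tiktoken Python libraries. The chunk size, output cap, and model name are illustrative assumptions, not settings recommended in this thread.

```python
# Sketch of "summarize in chunks": split a long text so each chunk fits
# comfortably in the context window, summarize each chunk, then summarize
# the combined summaries.
import tiktoken
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def split_into_chunks(text: str, chunk_tokens: int = 2500) -> list[str]:
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + chunk_tokens])
            for i in range(0, len(tokens), chunk_tokens)]

def summarize(text: str, max_output: int = 700) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        max_tokens=max_output,  # cap on the reply so prompt + reply fit the window
        messages=[{"role": "user", "content": f"Summarize this text:\n\n{text}"}],
    )
    return resp.choices[0].message.content

def summarize_long_text(text: str) -> str:
    partial = [summarize(chunk) for chunk in split_into_chunks(text)]
    return summarize("\n\n".join(partial))  # summary of the summaries
```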

3 Likes

Thank you for the very interesting response :+1:. Last questions:

I understood that if we input, for example, 4067 words (so tokens), there will be 1 token left in the output to generate the summary.

  1. Is this correct?
  2. Does a summary generated with 1 token in the output have less value than a summary generated with 500 tokens in the output?
  3. What would be the ideal number of tokens to keep in the output for a good summary?

Thank you for your responses.

I usually say “continue” to the AI and it finishes.

3 Likes

Answers:

1 - Yes, this is correct (but in your example you used 4067, so there will be 29 tokens left).

2 - At most, a single token can only represent a single word. On average, a single token is roughly four characters of English writing, and in some cases that is not enough to represent an entire word (“Samatha” requires 3 tokens, while “John”, “yes”, and “no” each require 1).

So a single-word summary will probably be no good to you, unless you want a “yes/no” answer.

3 - This is very subjective, but if it were me, I might input 2500 tokens and expect a summary of around 700. You should probably play around with the values to see what works for you (some text will be harder to summarize than other text).
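If you want to see how many tokens a particular word costs, a quick tiktoken check like the one below works; note that counts depend on the encoding, so the numbers may differ slightly from the examples quoted above.

```python
# Count the tokens consumed by individual words. The encoding name is an
# assumption; exact counts vary between model families.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for word in ["Samatha", "John", "yes", "no"]:
    print(word, "->", len(enc.encode(word)), "token(s)")
```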

2 Likes

I ask it (him?) “keep going from (the last part that I got)” and then it just keeps on going.
Well… most of the time, anyway.

1 Like

So if I understand you correctly, the token limit applies to both the question and the reply? So if I write/paste text that amounts to 2000 tokens, the AI will only have 2096 tokens left for its reply, and if that isn’t enough, that’s when it stops and I’d have to write “continue”?

I thought the token limit only applied to the answer.

The Playground confuses the issue. The max tokens setting in the Playground applies to the response only, but when you call the model in code, the limit covers the prompt plus the completion.
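As a sketch of how that plays out in code (the model name, window size, and the small formatting margin are assumptions for illustration):

```python
# In code, the prompt and the completion share the context window, so the
# reply cap (max_tokens) has to fit in whatever the prompt leaves over.
import tiktoken
from openai import OpenAI

CONTEXT_WINDOW = 4096
client = OpenAI()
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = "Summarize the following text:\n\n..."  # your long text here
reply_budget = CONTEXT_WINDOW - len(enc.encode(prompt)) - 16  # small margin for chat formatting

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    max_tokens=reply_budget,  # asking for more would overflow the window
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```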

2 Likes

Asking the bot to continue has helped, thanks!

“Continue” won’t work in all cases; sometimes it starts its response from the beginning on every subsequent prompt. Any other working prompts to tell it to continue from where it stopped…?

1 Like

Currently I’m testing NER and RE NLP processes, but I’m also facing the same problem. I guess ChatGPT can’t work well with long text.

The task I gave it was left incomplete by the bot several times.

This should be fixed. Most people are using it to get help with coding or writing content.

1 Like

Hope this will get solved with GPT-4’s 8k and 32k context windows. Still waiting on that.

But it’s a nightmare when you need to copy and paste it as one string.

It’s my one beef with it, along with it not remembering things correctly.

You can try “continue from… {include a line or two here}” or “continue and start from… {something}”.