“Error: This model’s maximum context length is 4097 tokens. However, your messages resulted in 5355 tokens. Please reduce the length of the messages.”
The prompt is very small. I’m using LlamaIndex with OpenAIEmbeddings and chunk_size=1000. The PDF gets indexed without an issue. The problem is that when I ask a simple question, “Describe to me what Ruby on Rails is”, I get this error.
This seems odd, since the prompt itself is tiny. Is it because the completion is too big?
The maximum context length is the sum of the inputs you send to the API (including any examples and the prompt)
as well as the output that GPT generates and returns. Most likely in your case, it’s the completion being too big that is causing the problem.
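The arithmetic behind the error can be sketched in a few lines. This is a minimal illustration using the numbers from the error message above, not actual API code; the point is that the prompt tokens and the completion budget share one context window.

```python
# Context-window budget, using the figures from the error message.
# The request fails whenever prompt tokens plus the completion budget
# exceed the model's maximum context length.

CONTEXT_LIMIT = 4097  # the limit quoted in the error message


def completion_budget(prompt_tokens: int, context_limit: int = CONTEXT_LIMIT) -> int:
    """Tokens left over for the model's output after the input is counted."""
    return context_limit - prompt_tokens


# With 5355 tokens of input (e.g. several retrieved 1000-token chunks plus
# the question), the budget is negative before generation even starts:
print(completion_budget(5355))  # -1258, hence the API error
```

In practice you would count the prompt tokens with a real tokenizer (such as the tiktoken library) rather than assume a figure.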
If your prompt is actually small and the majority of the tokens come from the output generation, then yes.
The problem with larger inputs to an LLM is that it has a tendency to hallucinate or wrongly use information from the middle of a huge prompt (I’m talking about something close to a 700 to 1000 token input prompt, from experience). If that is the case, it would be better to try to reduce the input prompt size.
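One hedged sketch of "reduce the input prompt size" in a retrieval setup: trim the retrieved chunks to a budget before they reach the model. Real code should count tokens with a proper tokenizer (e.g. tiktoken); here whitespace-separated words stand in as a rough proxy, and the function name is made up for illustration.

```python
# Keep retrieved chunks, in order, until a rough word budget is spent,
# so the assembled prompt stays well under the context limit.

def trim_context(chunks: list[str], max_words: int) -> list[str]:
    """Return the leading chunks whose combined word count fits the budget."""
    kept, used = [], 0
    for chunk in chunks:
        n = len(chunk.split())
        if used + n > max_words:
            break  # this chunk would blow the budget; stop here
        kept.append(chunk)
        used += n
    return kept


chunks = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
print(trim_context(chunks, 5))  # ['alpha beta gamma', 'delta epsilon']
```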
To add to @udm17’s comment, large prompts are fine if the prompt is full of contextual data, i.e., information to base the instruction on. What they’re not good for is hundreds and hundreds of instructions one after the other.
So, data is fine, but keep the requests per prompt to a few at most, say 3 or 4 tops; at 5 or more you will start to see significant degradation in performance.
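The advice above can be sketched as a simple batching step: instead of packing every instruction into one request, split the list into prompts of a few instructions each. The batch size of 4 follows the "3 or 4 tops" guidance; the function name is illustrative.

```python
# Split a long instruction list into batches small enough for one prompt
# each, per the guidance of keeping requests per prompt to 3 or 4.

def batch_instructions(instructions: list[str], per_prompt: int = 4) -> list[list[str]]:
    """Group instructions into consecutive batches of at most per_prompt items."""
    return [instructions[i:i + per_prompt]
            for i in range(0, len(instructions), per_prompt)]


tasks = [f"step {i}" for i in range(10)]
print([len(batch) for batch in batch_instructions(tasks)])  # [4, 4, 2]
```

Each batch would then be sent as its own request, alongside whatever contextual data it needs.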