Error code: 400: Max token length

Hi! I am trying to build a chatbot with LangChain and OpenAI. I am new to coding so it is very much trial-and-error.

Yesterday my code was running perfectly fine in my Colab notebook. Now I get an error saying I reached the maximum context length.

The error comes when running:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains.combine_documents import create_stuff_documents_chain

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(model, question_answering_prompt)

from langchain.memory import ChatMessageHistory

chat_history = ChatMessageHistory()

chat_history.add_user_message("Hvad er Salling?")  # Danish: "What is Salling?"

document_chain.invoke(
    {
        "messages": chat_history.messages,
        "context": texts,
    }
)

The question I am asking is very simple and should not require a long answer.

I get the same error across models. I suspect it is because this morning I re-ran the entire code very fast in my Colab notebook, and maybe this overwhelms the model? But as I am new to this, it could be many things I guess :slight_smile:

This is my error message:
BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, you requested 9850 tokens (1750 in the messages, 8100 in the completion). Please reduce the length of the messages or completion.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

The API's max_tokens setting is not supposed to match the context length of the model.

It reserves space in the context window for the response.

Set it to something like 2000, which is more than the AI is ever likely to write, leaving 6192 tokens for input.
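For example, with LangChain's ChatOpenAI wrapper (a minimal sketch, assuming your model object comes from the langchain_openai package), you set max_tokens when constructing the model:

from langchain_openai import ChatOpenAI

# max_tokens caps only the completion; the rest of the
# context window stays available for your input
model = ChatOpenAI(model="gpt-3.5-turbo", max_tokens=2000)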

But right now I have not specified max_tokens at all. If I try setting it to e.g. 2000, there is no difference…

Tried changing the model to gpt-3.5-turbo-16k, but now it is saying the messages result in 116313 tokens…

BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 16384 tokens. However, your messages resulted in 116313 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

I really do not understand why the messages result in SO many tokens. I am very confused here :smiley:

The "(1750 in the messages, 8100 in the completion)" part of your first error means something has sent the max_tokens parameter with a value of 8100.


Your second error, where the messages alone are 116313 tokens, means you have sent far more input than the model can handle.

You need to step far back from LangChain. If these calls did go through, you would be spending an excessive amount of credits without knowing exactly how the input and output are being used.
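For instance, you can count the tokens yourself before invoking the chain. A minimal sketch with tiktoken, assuming texts is the list of LangChain Document objects you pass as context:

import tiktoken

# cl100k_base is the encoding used by gpt-3.5-turbo and gpt-4
enc = tiktoken.get_encoding("cl100k_base")

# Count the tokens in everything about to be stuffed into the prompt
total = sum(len(enc.encode(doc.page_content)) for doc in texts)
print(f"Context tokens: {total}")

If that prints something near 116313, the documents themselves are the problem, not the chat history.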


Hi, adding a reply since I had a similar issue and it took some digging. Limiting the number of docs the retriever returned fixed it for me, by passing a top_n param.

retriever = db.as_retriever(max_tokens_limit=10000, top_n=5)
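If your vector store version does not accept those keyword arguments, the more common way to cap the number of retrieved chunks is search_kwargs. A sketch, assuming db is a LangChain vector store such as FAISS or Chroma:

# Retrieve only the 5 most similar chunks instead of passing the whole corpus
retriever = db.as_retriever(search_kwargs={"k": 5})
docs = retriever.invoke("Hvad er Salling?")

# Pass just the retrieved documents as context, not every document
document_chain.invoke(
    {
        "messages": chat_history.messages,
        "context": docs,
    }
)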