Error code: 400: Max token length

Hi! I am trying to build a chatbot with LangChain and OpenAI. I am new to coding so it is very much trial-and-error.

Yesterday my code was running perfectly fine in my Colab notebook. Now I get an error saying I reached the maximum context length.

The error comes when running:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains.combine_documents import create_stuff_documents_chain

question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

document_chain = create_stuff_documents_chain(model, question_answering_prompt)

from langchain.memory import ChatMessageHistory

chat_history = ChatMessageHistory()

chat_history.add_user_message("Hvad er Salling?")  # Danish: "What is Salling?"

document_chain.invoke(
    {
        "messages": chat_history.messages,
        "context": texts,
    }
)

The question I am asking is very simple and should not require a long answer.

I get the same error across models. I suspect it is because this morning I re-ran the entire code very fast in my Colab notebook, and maybe this overwhelms the model? But as I am new to this, it could be many things I guess :slight_smile:

This is my error message:
BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens. However, you requested 9850 tokens (1750 in the messages, 8100 in the completion). Please reduce the length of the messages or completion.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

The API's max_tokens setting is not supposed to match the context length of the model.

It reserves space in the context window for the response.

Set it to something like 2000, which is more than the AI is ever likely to write, leaving 6192 tokens for input.
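For example, with LangChain's ChatOpenAI wrapper (a minimal sketch, assuming your model object comes from the langchain_openai package), you set max_tokens when constructing the model:

from langchain_openai import ChatOpenAI

# max_tokens caps only the completion; the rest of the
# context window stays available for your input
model = ChatOpenAI(model="gpt-3.5-turbo", max_tokens=2000)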

But right now I have not specified max_tokens at all. If I try setting it to e.g. 2000, there is no difference…

Tried changing the model to gpt-3.5-turbo-16k, but now it is saying the messages result in 116313 tokens…

BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 16384 tokens. However, your messages resulted in 116313 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

I really do not understand why the messages result in SO many tokens. I am very confused here :smiley:

The "(1750 in the messages, 8100 in the completion)" part of your first error means something has sent the max_tokens parameter with a value of 8100.


Your second error, where the messages alone are 116313 tokens, means you have sent far more input than the model can handle.

You need to step far back from LangChain. If these calls did go through, you would be spending an excessive amount of credits without knowing exactly how the input and output are being used.
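For instance, you can count the tokens yourself before invoking the chain. A minimal sketch with tiktoken, assuming texts is the list of LangChain Document objects you pass as context:

import tiktoken

# cl100k_base is the encoding used by gpt-3.5-turbo and gpt-4
enc = tiktoken.get_encoding("cl100k_base")

# Count the tokens in everything about to be stuffed into the prompt
total = sum(len(enc.encode(doc.page_content)) for doc in texts)
print(f"Context tokens: {total}")

If that prints something near 116313, the documents themselves are the problem, not the chat history.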


Hi, adding a reply since I had a similar issue and it took some digging. Limiting the number of docs the retriever returned fixed it for me, by passing a top_n param.

retriever = db.as_retriever(max_tokens_limit=10000, top_n=5)
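If your vector store version does not accept those keyword arguments, the more common way to cap the number of retrieved chunks is search_kwargs. A sketch, assuming db is a LangChain vector store such as FAISS or Chroma:

# Retrieve only the 5 most similar chunks instead of passing the whole corpus
retriever = db.as_retriever(search_kwargs={"k": 5})
docs = retriever.invoke("Hvad er Salling?")

# Pass just the retrieved documents as context, not every document
document_chain.invoke(
    {
        "messages": chat_history.messages,
        "context": docs,
    }
)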