Error 400: Maximum context length exceeded

Hi!

I am currently building a retrieval-augmented generation (RAG) Q&A chatbot using the OpenAI API and LangChain.

I am using gpt-35-turbo, and the documents are loaded from a JSON file.
Since I need chat history, I use LangChain's BaseChatMessageHistory and RunnableWithMessageHistory in the chain.
I also split the documents with RecursiveCharacterTextSplitter and chunk_size=1000, if that matters.
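
For context, the chain is wired up roughly like this (a simplified sketch, not my exact code; the prompt text, the in-memory session store, and the way `{context}` gets filled are placeholders):

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
llm = ChatOpenAI(model="gpt-3.5-turbo")

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using the following context:\n{context}"),
    MessagesPlaceholder("history"),   # chat history is injected here
    ("human", "{question}"),
])

chain = prompt | llm

stores = {}  # session_id -> chat history

def get_session_history(session_id):
    # A BaseChatMessageHistory implementation, kept in memory per session
    if session_id not in stores:
        stores[session_id] = InMemoryChatMessageHistory()
    return stores[session_id]

chat = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
)
```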

After usually 4 or 5 prompts, I receive the following error:

openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4096 tokens. However, your messages resulted in 4239 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

Is there any way to deal with this error and have longer conversations, given the maximum context length of 4096 tokens?

Any advice or assistance would be greatly appreciated.

Thank you!

You can count the tokens you are sending with tiktoken, subtract that from 4096, and set the remainder as the max_tokens parameter.
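
Something along these lines (a rough sketch; the per-message overhead constants are approximations taken from OpenAI's cookbook heuristic for chat models):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def num_tokens(messages):
    """Approximate token count for a list of {"role", "content"} dicts."""
    total = 3  # every reply is primed with a few tokens
    for m in messages:
        total += 4  # rough per-message overhead for role/formatting
        total += len(enc.encode(m["content"]))
    return total

# `messages` must be the FULL payload: system prompt, retrieved context,
# chat history, and the new question
messages = [{"role": "user", "content": "Hello!"}]
max_tokens = 4096 - num_tokens(messages)  # budget left for the completion
```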

Thank you for your reply!

I used tiktoken, but the count is only around 1k tokens when I receive this error. Do you have any idea why?
Also, the method you suggested prevents the error, but it still won't let me have a longer conversation, right?
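
For now, the only workaround I can think of is dropping the oldest messages before each request so the payload stays under budget, something like this rough sketch (assuming plain role/content dicts and the same tiktoken counter as above):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def trim_history(messages, budget=2500):
    """Keep the most recent messages whose combined size fits the budget,
    leaving the rest of the 4096-token window for retrieved context and
    the reply. `messages` is a list of {"role", "content"} dicts, oldest
    first."""
    kept, used = [], 0
    for m in reversed(messages):
        cost = len(enc.encode(m["content"])) + 4  # rough per-message overhead
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return list(reversed(kept))
```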