Hi!
I am currently working on building a retrieval-augmented generation (RAG) Q&A chatbot using the OpenAI API and LangChain.
I am using gpt-35-turbo, and the document is loaded from a JSON file.
Since I need chat history, I use LangChain's BaseChatMessageHistory and RunnableWithMessageHistory in the chain.
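Roughly what the history part of my chain looks like (a simplified sketch; the actual prompt and the retrieval step are trimmed out, and names like `get_session_history` and `store` are just placeholders here):

```python
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")

# simplified prompt; in the real chain the {context} comes from the retriever
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using the provided context.\n\n{context}"),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

chain = prompt | llm

# simple in-memory store of per-session histories
store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

# each call appends the question and the answer to the stored history,
# so the messages sent to the model grow with every turn
answer = chain_with_history.invoke(
    {"input": "What does the document say about X?", "context": "..."},
    config={"configurable": {"session_id": "demo"}},
)
```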
I also split the document with RecursiveCharacterTextSplitter and chunk_size=1000, in case that matters.
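And the loading/splitting, roughly (again simplified; the file name and jq_schema are just example values, my real JSON structure is different):

```python
from langchain_community.document_loaders import JSONLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# load the JSON file into Document objects (jq_schema depends on the file layout)
loader = JSONLoader(file_path="data.json", jq_schema=".[]", text_content=False)
docs = loader.load()

# split into ~1000-character chunks before embedding/indexing
splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
chunks = splitter.split_documents(docs)
```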
Usually after 4 or 5 prompts, I receive the following error:
```
openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4096 tokens. However, your messages resulted in 4239 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
```
Is there any way to deal with this error and have longer conversations, given the maximum context length of 4096 tokens?
Any advice or assistance would be greatly appreciated.
Thank you!