Context length error with RetrievalQAWithSourcesChain

Hello, I have a problem: after a few messages with my chat I get this error:

error_code=context_length_exceeded error_message="This model's maximum context length is 8192 tokens. However, your messages resulted in 9066 tokens. Please reduce the length of the messages." error_param=messages error_type=invalid_request_error message='OpenAI API error received' stream_error=False

My main chain looks like this:

from langchain.chains import RetrievalQAWithSourcesChain
from langchain.memory import ConversationBufferMemory

chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",  # all retrieved documents are stuffed into one prompt
    retriever=retriever_prawo,
    reduce_k_below_max_tokens=True,
    chain_type_kwargs={
        "verbose": True,
        "prompt": prompt,
        "memory": ConversationBufferMemory(
            memory_key="history",
            input_key="question",
        ),
    },
)

but reduce_k_below_max_tokens=True is not helping. I've also tried setting chain.max_tokens_limit = 8000 later in the code, but that doesn't work either. I'm using Chainlit to build the chat.
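For completeness, the later attempt looked like this (as far as I can tell from the LangChain source, this limit only counts the tokens of the retrieved documents, not the prompt or the chat history):

# With reduce_k_below_max_tokens=True, the chain drops retrieved
# documents until their combined token count fits under this limit;
# the prompt, history, and question still count toward the 8192 total.
chain.max_tokens_limit = 8000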

How can I prevent my chat from getting this error?

Hi and welcome to the Developer Forum!

Have you looked on the LangChain forums?

At a basic level this is saying that the prompt is too large, so either something is not being managed correctly or you may have misunderstood some aspect of LangChain. I'm happy for your question to remain up here, but for a faster resolution it is probably best to ask there.

Okay, understood, I'll check there.
I thought maybe someone here has had the same problem, or knows how to "clear" the context when it gets too big.
So this could be a problem with the prompt? I should say I have a really big prompt with 10 examples, because I work with many documents and I'm trying to give the model examples from a few of them to show it how to respond.
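If the prompt is the issue, I was wondering whether I could let LangChain pick only as many examples as fit, something like this untested sketch (examples stands for my list of 10 example dicts, and max_length is a rough budget I made up):

from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector import LengthBasedExampleSelector

example_prompt = PromptTemplate(
    input_variables=["question", "answer"],
    template="Q: {question}\nA: {answer}",
)

# Selects examples in order until the length budget is used up, so the
# few-shot section of the prompt stays bounded instead of always
# including all 10 examples.
example_selector = LengthBasedExampleSelector(
    examples=examples,               # hypothetical: my 10 example dicts
    example_prompt=example_prompt,
    max_length=500,                  # rough budget, measured in words by default
)

# The stuff chain in RetrievalQAWithSourcesChain expects the
# "summaries" and "question" input variables.
prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    suffix="{summaries}\n\nQuestion: {question}\nAnswer:",
    input_variables=["summaries", "question"],
)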

Yes, this is due to a larger-than-allowed prompt; you will need to reduce it.

Okay, but that still didn't help: once I have 4-5 questions and answers, the model runs into this context problem again. Do you maybe know a way to clear the oldest messages from the context? Not from the prompt, but from the conversation I'm currently having?
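In case it helps anyone else who lands here, what I'm after is something like swapping the memory for a windowed one that only keeps the last few exchanges (untested sketch; k=3 is an arbitrary choice):

from langchain.memory import ConversationBufferWindowMemory

# Keeps only the last k question/answer exchanges; older turns are
# dropped from the prompt automatically instead of accumulating.
memory = ConversationBufferWindowMemory(
    k=3,                     # number of recent exchanges to keep
    memory_key="history",
    input_key="question",
)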

I don't use LangChain, I'm afraid; for that one you'd need to ask there.