How to build a Question and Answer Bot for context greater than 2048 tokens?

I want to build a Question and Answer bot for context text longer than 2048 letters, but I am restricted by the 2048 limit.
Is there any way to surpass the limit or any other way to give the long context once and then query the context with questions multiple times?

First, take a quick look at this:

It’s not actually a letter limit but a “token” limit, and there’s no getting around it unless/until larger models are released in the future. It really is part of the way the completion model works, not an arbitrary rate limit.

How to handle that for a Q&A bot is a question I’ve seen asked a lot, and there are a few approaches. I’m assuming you have a body of info > 2048 tokens that contains what you want to be able to answer Qs about (if you instead mean you need a Q&A experience that considers a chat history longer than 2048 tokens - or some other scenario - let me know).

I’ve seen two common approaches so far using just GPT-3, and both involve breaking the larger text down into chunks:

  1. Break the text into chunks, have GPT-3 summarize those chunks to create a smaller overall text, then use that text as the context. Fairly simple - but also potentially “lossy” in the summarizing.
  2. For each question, first go through each chunk with a prompt that asks “is this question likely to be answerable from this text?” Then pass only the “yes” chunks in as context for the actual question.
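Both approaches can be sketched roughly as below. This is only an illustration under some assumptions: word count is used as a crude stand-in for token count (a real tokenizer would be needed to be sure a chunk fits), and `summarize` and `is_answerable` are hypothetical callables that you would implement yourself as GPT-3 completion calls. None of these names come from an actual library.

```python
def chunk_text(text, max_words=300, overlap=30):
    """Split text into overlapping chunks of at most max_words words.

    Assumption: words approximate tokens. They do not match exactly,
    so leave headroom below the real token limit.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def condense(chunks, summarize):
    """Approach 1: summarize every chunk and join the summaries.

    summarize is a hypothetical callable (chunk -> shorter text).
    """
    return " ".join(summarize(chunk) for chunk in chunks)


def select_chunks(question, chunks, is_answerable):
    """Approach 2: keep only the chunks judged likely to hold the answer.

    is_answerable is a hypothetical callable (question, chunk) -> bool.
    """
    return [chunk for chunk in chunks if is_answerable(question, chunk)]
```

The overlap is there so that a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks. Note that approach 2 costs one model call per chunk for every question, while approach 1 pays the summarizing cost once up front.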

In reality, the best case here might actually be to use embeddings, or a different paid or open-source model, to quickly identify the right portion of the text to consider, then pass only that portion to GPT-3.
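As a rough sketch of that idea: embed the question and every chunk, rank the chunks by cosine similarity to the question, and send only the top few to the completion model. The `embed` function here is a toy bag-of-words stand-in so that the example runs without any external service; it is an assumption for illustration only, and in practice you would swap in a real embedding model. `top_chunks` is likewise a made-up helper name.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "The cat sat on the mat.",
    "Paris is the capital of France.",
    "Tokens are pieces of words.",
]
best = top_chunks("What is the capital of France?", chunks, k=1)
```

Chunk embeddings only need to be computed once and can be stored, so each new question costs one embedding call plus one completion call, rather than one model call per chunk.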


Hi, why do you need a long context? What’s the task, and what does the context look like? I bet there is a step missing somewhere (a sort of internal reflection before crafting the final answer).