Hi fellow enthusiasts.
What is the recommended approach for passing large context files, e.g. 30k tokens, into a prompt via the API?
Best,
Shaun
You’ll have to compress the prompt: send smaller pieces and ask the model to summarize each piece, then send the combined summaries in for final inference.
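Something like this, sketched with the openai v1 Python SDK (the chunk size, model name, and prompts are just placeholders):

```python
# Map-reduce style compression: summarize each chunk, then join the summaries.
# Assumes the openai v1 Python SDK; model and chunk size are illustrative.
from openai import OpenAI

client = OpenAI()

def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Summarize the text, keeping key facts and names."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

def compress(document: str, chunk_chars: int = 8000) -> str:
    # Naive fixed-width split; a token-aware splitter (e.g. tiktoken) would be better.
    chunks = [document[i:i + chunk_chars] for i in range(0, len(document), chunk_chars)]
    summaries = [summarize(chunk) for chunk in chunks]
    # The joined summaries become the compressed context for the final call.
    return "\n\n".join(summaries)
```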
Ah nice solution, thanks, I’ll give that a shot.
It might be worth checking out Sparse Priming Representation. You’d be able to instruct GPT with system prompts to effectively compress the data you want to give it. I think it’s a lot better than simply telling GPT to summarize the content you give it, but there are still at least some details that ultimately get lost, so it’d still make sense to be selective about what to compress versus what should remain explicit, to make the most of the token limits.
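As a rough illustration, the compression step is just a system prompt; the wording below is my own paraphrase of the SPR idea, not the official prompt:

```python
# Illustrative SPR-style compression prompt. This wording is a paraphrase of
# the concept, not the exact prompt from the SPR repo.
SPR_SYSTEM_PROMPT = (
    "You are a Sparse Priming Representation (SPR) writer. Distill the user's "
    "input into a short list of succinct assertions, concepts, and associations "
    "that would let a future language model reconstruct the original content."
)
```

You’d run your source material through that once, store the output, and feed the compressed version into later prompts.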
Thanks Matt, and thanks for sharing the link. I’m currently just passing in the standard “you are a helpful chatbot” message, and I think I’ll get a better outcome trying these ideas.
Hi!
There are good suggestions on this thread already, and as a side note, the SPR approach is essentially equivalent to what @jwatte suggested, just without the fancy wording.
Otherwise you can definitely look into RAG, as it is currently the standard way to provide larger context to the model; there’s a rough sketch at the end of this post.
Without knowing the specifics of your use case it is hard to tell, though. For example, whether you need several answers that build on top of each other based on the context, or just need to extract information from it.
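Here is a toy sketch of the RAG flow, with an in-memory cosine search standing in for a real vector database; the model names and chunking are placeholders:

```python
# Toy retrieval-augmented generation (RAG) loop, assuming the openai v1 SDK.
# A real setup would use a vector database instead of in-memory cosine search.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([item.embedding for item in response.data])

def answer(question: str, chunks: list[str], top_k: int = 3) -> str:
    chunk_vecs = embed(chunks)
    q_vec = embed([question])[0]
    # Cosine similarity ranks the chunks by relevance to the question.
    sims = chunk_vecs @ q_vec / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
    context = "\n\n".join(chunks[i] for i in np.argsort(sims)[::-1][:top_k])
    # Only the top-k relevant chunks go into the prompt, keeping it small.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```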
Thank you VB. I’ll take a look at RAG (& vector databases?); I’ve heard a bit about them but I don’t have any real knowledge of them yet. My use case is that I want an LLM to answer a large volume of questions based on a large body of context, say 100 questions of varying topics and depth of answers required, from about 30k words of relevant context.
Here, this should get you a good entry point.
Look for the examples.
Today’s update with GPT-4 Turbo has solved my query! The only thing now is to watch the cost, as these queries will certainly rack up if I’m regularly sending large token counts through.
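For anyone else watching spend, the back-of-envelope math is simple; the per-token prices below are placeholders, so check the pricing page for current rates:

```python
# Rough cost check. The prices are illustrative assumptions, not quoted rates.
def estimate_cost(input_tokens: int, output_tokens: int,
                  in_price_per_1k: float = 0.01, out_price_per_1k: float = 0.03) -> float:
    return input_tokens / 1000 * in_price_per_1k + output_tokens / 1000 * out_price_per_1k

# e.g. a 30k-token prompt with a 1k-token answer:
print(f"${estimate_cost(30_000, 1_000):.2f}")  # ~$0.33 at the assumed rates
```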
Is it OK if a moderator closes this topic?
Yes indeed, thank you for reminding me!