How to get a longer context window?

I’ve run into an issue where the 128k context window isn’t enough for my use case, since the input is very text-heavy, so I hit the max-token error. Can anyone suggest how to solve this?
Does OpenAI plan on increasing the context window?

Well, if you believe Google, they claim they have a 100-bazillion-token context window.

But in reality it looks more like they use some built-in retrieval-augmented generation (RAG) thing.

It’s a pretty commonly used pattern when dealing with a lot of text.

You can either build it yourself with embeddings (there’s a minimal sketch after the next paragraph): https://platform.openai.com/docs/guides/embeddings

or, if you don’t need as much control and you just want to try things out, you can use the assistant framework’s (or custom GPTs’) file tool, just to see if that’s enough for your use case.
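To make the embeddings route concrete, here’s a minimal sketch using the OpenAI Python SDK. The sample reviews and the model choice are placeholders for illustration, not anything prescribed in this thread:

```python
# Minimal sketch: embed a batch of texts with the OpenAI Python SDK.
# The sample reviews and the model choice are placeholders for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

reviews = [
    "Loved it! I would highly recommend!",
    "I was dissatisfied with the product. It wasn't well made.",
]

# One API call can embed many inputs at once; each input gets its own vector.
response = client.embeddings.create(
    model="text-embedding-3-large",
    input=reviews,
)
vectors = [item.embedding for item in response.data]
print(len(vectors), "vectors of dimension", len(vectors[0]))
```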

What specifically are you trying to accomplish?

I’m trying to parse thousands of reviews to quickly understand what users liked and didn’t like. When there are too many reviews, I run into the aforementioned error.

The quality of analysis of in-context information will suffer the more input you provide, up to the point where the AI is just making up plausible-sounding answers.

The AI uses an attention mechanism and masking to focus its computation on relevant information. Placing 1,000 reviews in context and asking for an overall opinion might be an okay use, but don’t expect that the AI has carefully considered each one to come up with a mathematical score for you.


This is a quick example of using embeddings for sentiment analysis. Somebody is disillusioned at not getting their pictures: how similar is their complaint to some generic reviews?

```
== Cosine similarity comparisons ==
0: "No images from DALL-E when I ask, and I just signe" -
1: "Loved it! I would highly recommend!"                - match score: 0.0744
2: "Was pretty good. It met the specs and not much els" - match score: 0.1620
3: "I was dissatisfied with the product. It wasn't wel" - match score: 0.2912
4: "Absolutely the worst garbage ever created. I'm sui" - match score: 0.2815
```
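For reference, a comparison like the one above could be computed roughly like this. The review strings are abbreviated stand-ins for the truncated examples in the output, and text-embedding-3-large is just one reasonable model choice:

```python
# Sketch: cosine similarity between one complaint and a few reference reviews.
# The strings are abbreviated stand-ins for the truncated examples above.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(
        model="text-embedding-3-large", input=texts
    )
    return np.array([item.embedding for item in response.data])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "No images from DALL-E when I ask, and I just signed up..."
references = [
    "Loved it! I would highly recommend!",
    "Was pretty good. It met the specs and not much else.",
    "I was dissatisfied with the product. It wasn't well made.",
    "Absolutely the worst garbage ever created.",
]

vectors = embed([query] + references)
for text, vector in zip(references, vectors[1:]):
    print(f"match score: {cosine_similarity(vectors[0], vector):.4f}  {text[:40]}")
```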

The tokens you send are also about two orders of magnitude cheaper: the cheapest 128k GPT-4 Turbo vs the best embeddings model is $10.00 vs $0.13 per million tokens. Since there is no instruction prompt to pay for with embeddings, it costs no more to send 1 or 10,000 requests of the same total token count (and around 2,000 inputs can go in one API call), giving you an independent embedding vector for every element to be evaluated.
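As a back-of-the-envelope, using the prices quoted above (assumed to be per million tokens) and a hypothetical batch of 10,000 reviews at roughly 100 tokens each:

```python
# Back-of-the-envelope cost comparison, using the prices quoted above
# (assumed to be per million tokens). Review count and length are made up.
tokens = 10_000 * 100            # e.g. 10,000 reviews of ~100 tokens each
gpt4_turbo_input = 10.00         # $ per 1M input tokens
embedding_model = 0.13           # $ per 1M tokens

print(f"GPT-4 Turbo input: ${tokens / 1e6 * gpt4_turbo_input:.2f}")  # $10.00
print(f"Embeddings:        ${tokens / 1e6 * embedding_model:.2f}")   # $0.13
```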

Each review’s vector can then be compared mathematically against a small collection of other reviews that were human-rated and then embedded, accumulating the similarities and their weights.

Once you find out where the best, worst, and neutral reviews rank in your algorithm, you can also apply broad sentiment analysis based on individual review inspection, with every review judged to the same quality.
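A sketch of that accumulation step, assuming you already have vectors for a handful of human-rated anchor reviews; the 1-5 rating scale and the weighting scheme are illustrative choices, not a prescribed method:

```python
# Sketch: score one review as a similarity-weighted average of human ratings.
# The 1-5 rating scale and the weighting scheme are illustrative assumptions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def sentiment_score(review_vec: np.ndarray,
                    anchor_vecs: list[np.ndarray],
                    anchor_ratings: list[float]) -> float:
    """Weight each human rating by how similar its review is to this one."""
    sims = np.array([cosine_similarity(review_vec, v) for v in anchor_vecs])
    weights = np.maximum(sims, 0.0) + 1e-9  # ignore anti-similar anchors
    return float(np.average(anchor_ratings, weights=weights))
```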

Sentiment analysis was one of the first emergent abilities discovered: “Hey, this thing can read people’s moods!”

Yep. I’d also go with @_j’s recommendation. It really is the more efficient approach if you have a larger dataset.

If you decide to go with the embeddings-based classification approach and you are new to embeddings & Co., then the following worked example from the OpenAI cookbook will come in handy.