Best practices for RAG over a large document

Hi guys

I'm wondering if anyone is willing to share how they deal with very large documents.

Let's take a simple case: in the database, we have a session transcript of 15,000 words. How do you respond to a user's question like:

  • give me the top 5 takeaways from the session

Obviously, 15,000 words is too big for GPT-3.5's context window. GPT-4 can accept it, but it's dreadfully slow, taking anywhere from 20 seconds to over 2 minutes.

So what is your approach with GPT-3.5?

thanks

If you need to answer questions like this one, which concern the entire context, you'll either need to send the whole context or create a summary that you can retrieve instead.

Yes, exactly, but obviously I can't send the full content, as it's bigger than the token window.

Use GPT-4-turbo instead; its context window is 128k tokens.

Thanks mate, that works. But it takes anywhere from 30 seconds up to 2 minutes to get an answer from it.

Yeah, in my experience, even at Tier 5 billing, it can take minutes for 4k+ token prompts…

Could you split it up maybe?

Honestly, I'd probably break it down into smaller chunks, say 1,000 to 1,500 words each. That way, I can focus on one chunk at a time and use GPT-3.5 to help me extract the key points from each. It's like eating an elephant, one bite at a time!
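A minimal sketch of that chunk-then-extract flow in Python. The word-based splitter and the `extract_fn` placeholder are my own assumptions, not a specific library API; `extract_fn` stands in for whatever GPT-3.5 call you make per chunk:

```python
def split_into_chunks(text, max_words=1500):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

def map_reduce_takeaways(transcript, extract_fn, max_words=1500):
    """Extract key points from each chunk, then combine the notes.

    extract_fn is a placeholder for your per-chunk GPT-3.5 prompt,
    e.g. "list the key points in this excerpt".
    """
    chunks = split_into_chunks(transcript, max_words)
    partial_notes = [extract_fn(chunk) for chunk in chunks]
    # The combined notes are usually short enough to fit into one
    # final prompt asking for the top 5 takeaways.
    return "\n".join(partial_notes)
```

A 15,000-word transcript becomes ten 1,500-word chunks, each comfortably within GPT-3.5's window; the combined notes then go into a single final "top 5 takeaways" prompt.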

Another approach I’d take is to use some keyword extraction techniques to identify the most important phrases and keywords. That way, I can quickly see what the document is about and what’s most relevant.
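As a rough sketch of that idea, even a plain frequency count (ignoring common stopwords) surfaces the dominant terms; the tiny stopword list here is just an illustrative stub, not a real resource:

```python
import re
from collections import Counter

# Tiny illustrative stopword list; in practice use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "it", "that", "this", "we", "you", "i", "for", "on"}

def top_keywords(text, n=10):
    """Return the n most frequent non-stopword terms in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]
```

Running this over the whole transcript gives a quick picture of what the session is about before you spend any tokens on it.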

If I had to get really manual, I’d just sit down and read the thing, taking notes as I go. It’s old school, but sometimes that’s the best way to really understand what’s going on.

Lastly, I might use some other tools, like spaCy or NLTK, to help me preprocess the document and extract key points. It’s like having a team of experts helping me out!

So yeah, that’s how I’d tackle that beast of a document!


Maybe you could come up with a scoring mechanism where you can take any paragraph and assign it a score, either by using cosine similarity against some "known" vector in semantic space (i.e. a vector database), or just with a prompt that asks for a score based on certain criteria. That way, even if you split the content into chunks of arbitrary sizes, the scoring mechanism still gives you "apples to apples" comparisons, so to speak. Then it's a simple matter of choosing the top N (like N=5) pieces of content, because you generated the score of each one independently, in separate prompts.
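A sketch of the cosine-similarity variant, assuming you have some `embed()` function available. Here it's stubbed with a toy bag-of-words vectorizer; a real embedding model (or vector database lookup) would replace it:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; swap in a real embedding model."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_n_chunks(chunks, query, n=5):
    """Score every chunk against the same query vector, keep the top n."""
    q = embed(query)
    scored = [(cosine_similarity(embed(c), q), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:n]]
```

Because each chunk is scored against the same reference vector, chunks of different sizes stay roughly comparable, which is what makes the top-N selection meaningful.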
