How to let ChatGPT fully digest a really large text?

wai2018 · March 24, 2023, 10:02am

I’m trying to use ChatGPT to analyze a large text, such as a book, and extract every detail from it. However, I’m not sure how to do this effectively. I’ve tried breaking the text into chunks and using embeddings, but this method seems to lose important contextual information.

For example, if I break up a passage like “The dog bites Peter. It runs away after that. The cat smiles.” into three separate chunks, one per sentence, and then ask ChatGPT a question like “Which animal runs away after Peter gets bitten?”, it won’t be able to provide the correct answer.

Do you have any suggestions for how to let ChatGPT fully digest a large text without losing important context?

udm17 · March 24, 2023, 10:54am

Rather than breaking it into single sentences, try embedding a paragraph as a whole (e.g. all three sentences in your example), so that the context is captured correctly. The right chunk length isn’t going to be fixed, so you’ll have to experiment and figure out what works best for your case.
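A minimal sketch of paragraph-level chunking, assuming paragraphs are separated by blank lines and that a simple word count is a good enough size measure. The function name `paragraph_chunks` and the `max_words` budget are made up for illustration; a real setup would more likely count tokens.

```python
import re

def paragraph_chunks(text, max_words=200):
    """Pack whole paragraphs into chunks of at most max_words words.

    A paragraph is never split, so a single paragraph longer than
    max_words ends up as its own oversized chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current, count = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

text = (
    "The dog bites Peter. It runs away after that. The cat smiles.\n\n"
    "A second paragraph follows."
)
chunks = paragraph_chunks(text, max_words=12)
for chunk in chunks:
    print(repr(chunk))
```

With the small budget used here, the three related sentences stay together in one chunk and the second paragraph becomes its own chunk.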

wai2018 · March 24, 2023, 11:13am

I have chunked the entire book into paragraphs, but I’m still worried that some paragraphs might depend on information from previous paragraphs.

Ozlem_Williams · March 24, 2023, 11:14am

Share a full page and ask it to read the page and reply with just “Read” once it has. Then share the second page, and keep going.

udm17 · March 24, 2023, 12:37pm

If it’s an entire book, maybe try chunking by chapter or by half chapter. GPT will have some inherent knowledge as well, so it comes down to whether it can connect the dots, which it should be able to do. Like I said, most of the time there’s no fixed answer to this kind of problem, and you’ll have to test what works best for your situation.

aeg13_oth · June 28, 2023, 12:29am

wai2018, you ask a very good question. I’m not sure why people provide terrible answers instead of simply saying ‘I don’t know’… maybe they are ChatGPT bots themselves haha… Besides that, if there were a way to run a search that returns relevant content across chunks and then merges it into a temporary chunk to answer the question, that could be a solution. Not sure how a program could do this though.
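One way a program might do this, as a rough sketch only: score every chunk against the question, take the best match, and merge it with its neighbouring chunks into one temporary context. The bag-of-words cosine similarity below is a toy stand-in for real embeddings, and the names `build_context`, `top_k` and `window` are invented for this example.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    # Toy stand-in for an embedding: lowercase word counts.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def build_context(question, chunks, top_k=1, window=1):
    """Merge the top_k best-matching chunks, plus `window` neighbours on
    each side, into one temporary context in original document order."""
    q = bag_of_words(question)
    scores = [cosine(q, bag_of_words(c)) for c in chunks]
    best = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:top_k]
    keep = set()
    for i in best:
        keep.update(range(max(0, i - window), min(len(chunks), i + window + 1)))
    return " ".join(chunks[i] for i in sorted(keep))

chunks = [
    "The dog bites Peter.",
    "It runs away after that.",
    "The cat smiles.",
    "Later, it starts to rain.",
]
context = build_context("Which animal runs away after Peter gets bitten?", chunks)
print(context)
```

The merged context would then be placed in the prompt together with the question. Pulling in the neighbours is what brings “The dog bites Peter.” back next to “It runs away after that.”, at the cost of sometimes including unrelated sentences.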

Foxalabs · June 28, 2023, 10:24am

You can implement overlap between the embedded chunks, as described here: The length of the embedding contents - #23 by curt.kennedy
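A generic sliding-window sketch of the overlap idea, not necessarily the exact method in the linked post. It assumes the text is already split into sentences; `overlapping_chunks`, `size` and `overlap` are names chosen for this example.

```python
def overlapping_chunks(sentences, size=2, overlap=1):
    """Group sentences into windows of `size`, where each window repeats
    the last `overlap` sentences of the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + size]))
        # Stop once a window has reached the end of the text.
        if start + size >= len(sentences):
            break
    return chunks

sentences = ["The dog bites Peter.", "It runs away after that.", "The cat smiles."]
chunks = overlapping_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

Because every sentence boundary falls inside at least one chunk, “The dog bites Peter.” and “It runs away after that.” are embedded together, so the pronoun keeps its referent.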
