Hey everyone, I’m building a simple project for myself and wanted to know: what’s the best way to efficiently interact with super long PDFs (2000+ pages)? I want to extract information efficiently and have a chatbot interface for easy querying. Any tips or approaches for making this process simple and really effective? Should I go the classic route of LangChain + a vector DB, or maybe try building a custom GPT for this? Accuracy matters a lot to me, so I’d love your thoughts on how you’d handle documents this long. Thanks for your help!
Did you end up implementing this? If so, what approach did you go with?
When it comes to handling PDF interactions, your specific goals and how much complexity you’re willing to take on will determine the best approach for you. Here are two effective methods to consider:
1. Easy Implementation with File Search Feature
For a straightforward solution, you can use the file search feature available in the Assistants API. It lets you stand up a working PDF search capability quickly, and it’s ideal if you just need a simple, efficient way to search through PDF documents.
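Here’s a minimal sketch of that setup, assuming a recent openai Python SDK where the Assistants API lives under `client.beta`; the file name, model, and question text are placeholders for your own:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Create a vector store and upload the PDF so file search can index it.
vector_store = client.beta.vector_stores.create(name="long-pdf-store")
with open("manual.pdf", "rb") as f:  # placeholder path
    client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id, files=[f]
    )

# Create an assistant with the file_search tool pointed at that store.
assistant = client.beta.assistants.create(
    model="gpt-4o",
    instructions="Answer questions using only the attached PDF.",
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)

# Ask a question on a fresh thread and poll the run to completion.
thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "Summarize chapter 12."}]
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

# Messages are returned newest-first, so data[0] is the assistant's reply.
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```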
2. Custom Tool for Enhanced Functionality
If you’re looking for a more tailored approach, consider creating a custom tool that gets invoked whenever a user asks a question about the PDF. This method gives you more control, enabling you to:
- Develop a Custom Splitting Strategy: Define how your PDF content is divided for more accurate retrieval.
- Implement Custom Retrievers: Use specific retrievers to fetch the most relevant information for the user’s query (see the sketch after this list).
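Here’s a rough sketch of what a custom splitter + retriever could look like, assuming pypdf for text extraction and OpenAI embeddings; the chunk size, overlap, and model name are illustrative, not prescriptive:

```python
import numpy as np
from openai import OpenAI
from pypdf import PdfReader

client = OpenAI()

def split_pdf(path, chunk_size=1500, overlap=200):
    """Naive splitter: concatenate page text, slice into overlapping chunks."""
    text = "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def embed(texts):
    """Embed a list of texts. For a 2000+ page PDF you'll have thousands of
    chunks, so batch this call to stay under the API's per-request limits."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

chunks = split_pdf("manual.pdf")  # placeholder path
chunk_vecs = embed(chunks)

def retrieve(query, k=5):
    """Return the k chunks most similar to the query (cosine similarity)."""
    q = embed([query])[0]
    scores = chunk_vecs @ q / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q)
    )
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
```

The naive character-based splitter is just the baseline; swapping in context-based splitting (by section, heading, or page) is exactly where a custom strategy pays off on long documents.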
At the end of the day, you should be passing back to the LLM only the subset of information that is relevant to the user’s query. Check out other posts on this forum where people have discussed context-based splitting.
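To tie it together, here’s a hedged sketch of wiring a retriever like the one above into the Assistants API as a custom function tool; the tool name `search_pdf`, its schema, and the model are my assumptions, and `retrieve()` is the function from the previous sketch:

```python
import json
from openai import OpenAI

client = OpenAI()

# Register the custom tool so the model can call it when the user
# asks something about the PDF.
assistant = client.beta.assistants.create(
    model="gpt-4o",
    instructions="Use search_pdf to look up the document before answering.",
    tools=[{
        "type": "function",
        "function": {
            "name": "search_pdf",
            "description": "Return PDF passages relevant to a query.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }],
)

thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "What does section 4.2 say?"}]
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

# When the model calls search_pdf, run the retriever and hand back only
# the relevant chunks -- the "subset of information" mentioned above.
if run.status == "requires_action":
    outputs = []
    for call in run.required_action.submit_tool_outputs.tool_calls:
        query = json.loads(call.function.arguments)["query"]
        outputs.append({
            "tool_call_id": call.id,
            "output": "\n---\n".join(retrieve(query)),
        })
    run = client.beta.threads.runs.submit_tool_outputs_and_poll(
        thread_id=thread.id, run_id=run.id, tool_outputs=outputs
    )
```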
Combine your custom tool with the thread management capabilities of the Assistants API, as in the sketch above, and it should do the job. Hope this helps. Cheers