Efficiently Interacting with Super Long PDFs/Documents

Hey everyone, I’m building a simple project for myself and wanted to know: what’s the best way to efficiently interact with super long PDFs (2000+ pages)? I want to extract information efficiently and have a chatbot interface for easy querying. Any tips or approaches to consider for making this process simple and effective? Like, should I go with the classic approach of LangChain + a vector DB, or maybe even try building a custom GPT for this? Accuracy matters a lot to me, and since these are super long documents, I’d love your thoughts on how you’d do this. Thanks for your help!
