Responses based on 100 PDF Input Feeds

I understand that the ChatGPT API returns results based on the conversation so far and on the general knowledge it was trained on.

But how do I feed in data from some 100 PDFs and get answers to questions based on those PDFs?

The short answer is, you don’t. That’s far too many documents to pass to ChatGPT directly.

You’ll need to use some other system for information retrieval.

I have the original text that was used in the PDFs, so how about using 100 text files as input and getting the ChatGPT API to answer questions based on those 100 text files alone?

If you want to use the API with built-in retrieval, you are limited to the Assistants API, unless you want to build your own RAG (retrieval-augmented generation) pipeline.

With the Assistants API, you’re limited to 20 documents of no more than 2,000,000 tokens each.

If you can combine your 100 documents into 20, each with no more than 2,000,000 tokens, you can do it.
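As a rough illustration of the combining step, here is a minimal Python sketch. The function name `merge_documents` and the `max_files` parameter are made up for this example, and it only groups the texts; it does not count tokens, so you would still need to check each merged file against the token limit with a tokenizer of your choice.

```python
import math


def merge_documents(texts, max_files=20):
    """Combine a list of document strings into at most `max_files` strings.

    Documents are kept in their original order and grouped contiguously.
    NOTE: this does not check the per-file token limit.
    """
    if not texts:
        return []
    # Number of source documents that go into each merged file.
    per_file = math.ceil(len(texts) / max_files)
    merged = []
    for start in range(0, len(texts), per_file):
        group = texts[start:start + per_file]
        # Separate the original documents with a blank line.
        merged.append("\n\n".join(group))
    return merged
```

With 100 input texts and the default `max_files=20`, each merged file holds 5 of the original documents.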

So what’s my next step in implementing RAG in code? Where do I start?

There are a TON of articles and tutorials on YouTube. The answer depends on your skills and the platform you are working on…

I would like to implement this in my Django-based application.
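Since Django is Python, one possible starting point is a plain-Python sketch of the retrieval half of RAG: split the text files into chunks, rank the chunks against the question, and build a prompt from the best matches. Everything here is an assumption for illustration: the function names are made up, and the bag-of-words cosine similarity is only a stand-in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=200):
    """Split text into chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def vectorize(text):
    """Bag-of-words term counts (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=3):
    """Return the k chunks most similar to the question."""
    q = vectorize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=3):
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n---\n".join(retrieve(question, chunks, k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

In a Django app, a view could call `build_prompt` with the user's question and the pre-chunked text files, then send the resulting string to the chat API as the user message. That call is left out here because it depends on your client library and setup.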