ChatGPT API with 200 massive PDF files

I have 200 PDF files vectorized and I am using the ChatGPT API with the Retrieval tool. My question is: what is the best way to reduce hallucinations and get better precision in the answers?

Welcome to the community!

Your options with retrieval are pretty limited, especially if you don’t want to edit and clean your documents.

In terms of using RAG in general, there’s vigorous debate on the matter. You can join the fray here:

Hi @s.rodriguez, if those 200 PDF files can be segmented by topic, closely related keywords, or any other characteristic by which they could be clustered, I would suggest adding a routing / classification layer before RAG. I have personally used this approach multiple times for larger knowledge bases, and it has worked out quite well. Hope that was of some help.
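To make the idea concrete, here is a minimal sketch of that routing layer in Python. The cluster names and keywords are hypothetical examples, and the keyword-overlap scoring is just a stand-in for whatever classifier you actually use; the point is the control flow of routing a query to one cluster before retrieval runs.

```python
# Minimal sketch of a routing layer before RAG: classify the incoming
# query against topic clusters, then search only the matching cluster's
# documents. Cluster names and keywords here are hypothetical examples.

TOPIC_CLUSTERS = {
    "billing": {"invoice", "payment", "refund", "pricing"},
    "security": {"encryption", "password", "token", "audit"},
    "onboarding": {"setup", "install", "account", "tutorial"},
}

def route_query(query: str) -> str:
    """Return the cluster whose keywords best overlap the query.

    In practice you would likely replace this keyword overlap with an
    LLM-based classifier (a cheap chat-completion call that returns one
    cluster label), but the surrounding control flow stays the same.
    """
    words = set(query.lower().split())
    scores = {name: len(words & kws) for name, kws in TOPIC_CLUSTERS.items()}
    best = max(scores, key=scores.get)
    # Fall back to searching everything if no cluster matches at all.
    return best if scores[best] > 0 else "all"

print(route_query("How do I get a refund for this payment?"))  # billing
print(route_query("Tell me about quantum computing"))          # all
```

Once the query is routed, you only search the vector store (or file subset) belonging to that cluster, which shrinks the retrieval context and tends to cut down hallucinations.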

Vasyl, thank you very much for the suggestion.
Our agent, built in Python with the OpenAI API, processes the 200 vectorized PDF files through the Retrieval tool offered by the API. Currently, we run two internal assistants before generating a response to any chat query about the PDF documents. We could add a third assistant to handle classification and route each query to the PDF(s) related to the topic being searched, so the retrieval is directed at specific documents. We will try it. If you have any other ideas on how to perform the routing, they would be welcome. Thank you!