Search for different words in a PDF file and report the page where each word is found

Hi, I’m trying to train a GPT to search for some words in a PDF file. It should be really simple, like using the search option in Acrobat Reader.
After a lot of iterations I still can’t get it to accomplish this simple task.
What is the best approach to this?

Hi and welcome to the community forums.

Could you elaborate on what you mean by “I’m trying to train a gpt”?

You might want to read the PDF using OCR, e.g. pytesseract. Since pytesseract works on images rather than on PDFs directly, you may also want to split the PDF into per-page images first (for example with pdf2image), OCR each page, and store the results in a data structure keyed by page number. Or you can use the Assistants API if you want to rely on OpenAI’s ability to do the OCR part (they most probably create chunks from it and store them in a vector DB, which might not be an ideal solution for finding exact words and their page numbers).
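
Here is a minimal sketch of the per-page approach, in case it helps. It assumes pdf2image and pytesseract are installed (along with the Poppler and Tesseract binaries they depend on). The function names `ocr_pdf_pages`, `build_page_index` and `find_pages` are made up for illustration. The index itself is plain Python, so the page texts could come from any extractor, not just OCR.

```python
import re
from collections import defaultdict


def ocr_pdf_pages(pdf_path):
    """Return one text string per page via OCR.

    Assumption: pdf2image and pytesseract are installed. They are imported
    lazily so the rest of this sketch runs without them.
    """
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(pdf_path)  # one PIL image per page
    return [pytesseract.image_to_string(image) for image in images]


def build_page_index(page_texts):
    """Map each lowercase word to the set of 1-based pages it appears on."""
    index = defaultdict(set)
    for page_number, text in enumerate(page_texts, start=1):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(page_number)
    return index


def find_pages(index, words):
    """Return {word: sorted page numbers}; words not found map to []."""
    return {word: sorted(index.get(word.lower(), ())) for word in words}


if __name__ == "__main__":
    # Stand-in page texts; with a real file use: pages = ocr_pdf_pages("file.pdf")
    pages = [
        "Invoice total due in March",
        "Payment terms and invoice number",
        "Contact details",
    ]
    index = build_page_index(pages)
    print(find_pages(index, ["invoice", "contact", "missing"]))
```

This only matches whole words, case-insensitively. Phrase search or fuzzy matching (useful when the OCR misreads characters) would need more work. Once the index is built, the lookup is exact, so the model would only need to call `find_pages` rather than search the text itself.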