I’m trying to build an application that could take in PDFs and images uploaded by professors, store them in the backend, and use these materials as the relevant context in a chatbot-like student interface. In general, how can I use the GPT-4 API to accomplish this? The purpose of the application is to not only use the text within PDFs as context, but also the images, graphs, and visual representation.
I cannot help with your general question, but I can give some advice on this part:
Internally a PDF file is more like an archive file in that it is a collection of different resources organized into a hierarchy. The images, graphs, and visual representations can be stored as different resources.
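To get a feel for that hierarchy, here is a rough sketch that counts the `/Type` and `/Subtype` entries visible in a PDF's raw bytes. The function name `inventory_pdf_objects` is made up for illustration, and it only uses the Python standard library. It assumes the object dictionaries are stored uncompressed; files written with compressed object streams (PDF 1.5 and later) hide many of them, so treat the counts as a lower bound and use a real PDF parser for anything serious.

```python
import re
from collections import Counter

# Matches "/Type /Name" and "/Subtype /Name" entries in object dictionaries.
# Assumption: the dictionaries are stored as plain (uncompressed) text.
_TYPE_RE = re.compile(rb"/(Type|Subtype)\s*/([A-Za-z0-9]+)")


def inventory_pdf_objects(data: bytes) -> Counter:
    """Count the /Type and /Subtype names found in the raw bytes of a PDF.

    This is a byte-level heuristic, not a parser: binary stream data can
    produce false matches, and compressed object streams are not decoded.
    """
    counts = Counter()
    for key, name in _TYPE_RE.findall(data):
        counts[f"{key.decode()}={name.decode()}"] += 1
    return counts
```

Called on the bytes of a file, e.g. `inventory_pdf_objects(open("lecture.pdf", "rb").read())` (the filename is a placeholder), entries such as `Type=Page`, `Type=Font`, or `Subtype=Image` give a quick idea of which resources the file carries.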
The first major note about the internals of a PDF file is that most of the text and graphics produced by LaTeX are stored as PostScript-style drawing instructions rather than as image files. This makes it very hard to extract non-text parts such as graphs in any form that is meaningful beyond display.
However, it is not uncommon for images that were created outside LaTeX to be included as originally created, often as embedded image files.
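For that second case, a minimal sketch of pulling embedded JPEGs out of a PDF is shown below. It relies on the fact that JPEG images (the `/DCTDecode` filter) are stored in the file as ordinary JPEG bytes, so they can be recognised by their start-of-image marker. The function name `extract_embedded_jpegs` is hypothetical, only the standard library is used, and the approach assumes the file is not encrypted. Images stored with other filters (for example Flate-compressed pixel data) and vector graphs will not be found this way.

```python
import re

# Every PDF stream sits between the keywords "stream" and "endstream".
_STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)

# JPEG data begins with the start-of-image marker FF D8 followed by FF.
JPEG_SOI = b"\xff\xd8\xff"


def extract_embedded_jpegs(data: bytes) -> list[bytes]:
    """Return the raw bytes of each stream that looks like a JPEG image.

    Heuristic only: it scans the file bytes instead of parsing the PDF,
    so it cannot handle encrypted files or non-JPEG image encodings.
    """
    return [
        match.group(1)
        for match in _STREAM_RE.finditer(data)
        if match.group(1).startswith(JPEG_SOI)
    ]
```

Each returned item can be written straight to a `.jpg` file and then passed to whatever image-handling step the application uses.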
While there are many free PDF-to-text applications and sites, doing what you want will most often require commercial software, e.g. Adobe Acrobat or IDR Solutions BuildVU.