The simplest solution is to convert the PDF into images, and then use the vision capability: https://platform.openai.com/docs/guides/vision
You can test the results in the playground to see if they are suitable for your case.
To convert a PDF into images with ImageMagick, the command is as simple as:
convert -density 300 input.pdf -background white -alpha remove -alpha off page-%d.jpg
It’s also easy to limit the number of pages to, say, 10 pages, with:
convert -density 300 'input.pdf[0-9]' -background white -alpha remove -alpha off page-%d.jpg
The gpt4o-mini ability to understand the content (both text and images) is great. More than I needed for my use case: the ability for our user to upload any kind of PDF and get a draft to work on.