How to extract technical expressions from PDFs so that they can be understood by AI?

Great question.
I've puzzled over this for a long time myself.

I would always advise you to try to obtain the source files for the document first (usually a .tex file).
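If you do get the .tex source, pulling the math out is mostly string matching. Here is a rough sketch of how I would do it, assuming the document only uses the common delimiters ($...$, $$...$$, \[...\], and equation/align environments). The function name `extract_math` and the pattern are my own illustration, not part of any library, and a real LaTeX parser would handle edge cases (macros, nested environments, comments) that this does not.

```python
import re

# Common LaTeX math delimiters. This is NOT exhaustive: it is an
# illustrative pattern, and it ignores comments, macros, and nesting.
MATH_PATTERN = re.compile(
    r"\\begin\{(equation\*?|align\*?)\}(.+?)\\end\{\1\}"  # environments
    r"|\\\[(.+?)\\\]"                                      # \[ ... \]
    r"|(?<!\\)\$\$(.+?)(?<!\\)\$\$"                        # $$ ... $$
    r"|(?<!\\)\$(.+?)(?<!\\)\$",                           # $ ... $
    re.DOTALL,
)


def extract_math(tex: str) -> list[str]:
    """Return the math expressions found in a LaTeX string, in document order."""
    expressions = []
    for match in MATH_PATTERN.finditer(tex):
        # Group 1 is the environment name; groups 2-5 hold the math body
        # for whichever alternative matched.
        body = match.group(2) or match.group(3) or match.group(4) or match.group(5)
        expressions.append(body.strip())
    return expressions


sample = (
    r"Energy is $E = mc^2$. Also \[ a^2 + b^2 = c^2 \] and "
    r"\begin{equation} F = ma \end{equation}"
)
for expr in extract_math(sample):
    print(expr)
```

The extracted strings are plain LaTeX, which most language models read without trouble.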

If that isn't possible, you can use an image-to-text AI to convert the PDF to LaTeX, but this isn't always reliable. The only service I've tried that does it reliably is Mathpix.

Mathpix is a paid service, but it has an API that accepts PDFs and converts them to LaTeX, math included.
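For anyone who wants to try it, this is roughly what a submission looks like. Treat it as a sketch from memory rather than the official client: the endpoint `https://api.mathpix.com/v3/pdf`, the `app_id`/`app_key` headers, and the `conversion_formats` option are how I recall the Mathpix documentation describing it, so please check the current docs before relying on them. The function names and the example PDF URL are mine.

```python
import json
import urllib.request

# Assumption: endpoint and field names as I recall them from the Mathpix
# docs. Verify against the current documentation before use.
MATHPIX_PDF_ENDPOINT = "https://api.mathpix.com/v3/pdf"


def build_pdf_request(pdf_url: str, app_id: str, app_key: str) -> urllib.request.Request:
    """Build (but do not send) a request asking Mathpix to convert a PDF at a URL."""
    payload = {
        "url": pdf_url,
        # Assumption: this asks for a zipped LaTeX project as output.
        "conversion_formats": {"tex.zip": True},
    }
    return urllib.request.Request(
        MATHPIX_PDF_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "app_id": app_id,
            "app_key": app_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_pdf(pdf_url: str, app_id: str, app_key: str) -> dict:
    """Send the request and return the parsed JSON response.

    Needs valid credentials and network access. The response should
    contain an identifier for the conversion job, which you then poll.
    """
    request = build_pdf_request(pdf_url, app_id, app_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Conversion is asynchronous, so after submitting you have to poll for completion and then download the result. I've left that part out because I don't want to guess at the exact response fields.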

I would be very interested to hear if anyone knows of an open-source equivalent that does the same.
