I set out to develop an examMaker App tailored for mathematical exams, and a key part of this process involved converting numerous math exams from PDF to LaTeX format. After experimenting with various tools and models, I discovered that only two options proved consistently reliable:
1- Mathpix: Widely regarded as the premier tool for converting images and PDFs containing mathematical expressions and tables into LaTeX. However, there are some drawbacks:
a- It comes with a hefty price tag.
b- Despite its prowess, it struggles with large tables containing numerous mathematical equations, such as the Cambridge Markscheme 9709_s22_ms_11.pdf (link https://papers.gceguide.net/A%20Levels/Mathematics%20(9709)/2022/9709_s22_ms_11.pdf).
2- InftyReader (version 3.1.1.2): Although this is an older version that is only available unofficially (as a modded build), it ranks second to Mathpix in accuracy.
Other tools I experimented with include:
3- InftyReader (version 3.3.2.3, latest version): I do not recommend purchasing this software for the following reasons:
a- While relatively affordable (around $40), it fails to extract images, such as diagrams and function plots, which are crucial for my task.
b- It is slower than the older mod version.
c- Unfortunately, the license is only valid for the latest version, so you still need the mod version if you want better quality.
4- Meta Nougat (base and small models): This free model for converting PDFs into LaTeX has its limitations:
a- It operates at a sluggish pace, even when run on cloud-based platforms like Colab (free version), taking approximately 9 minutes to process a 20-page document.
b- The output is often inaccurate, with missing numbers, text, and equations.
c- Additionally, it outputs in Markdown (MMD) format, requiring further conversion to LaTeX.
For those interested in exploring Meta Nougat, you can refer to this tutorial (https://www.youtube.com/watch?v=SYO_4uhdHKM). However, based on my experience, I advise caution due to its speed and accuracy issues.
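To give an idea of what the extra MMD to LaTeX step involves, here is a minimal sketch in Python. It is not the converter I used and not part of Nougat; the function name mmd_to_latex is made up for illustration. It assumes the MMD text uses "#" headings and "**bold**", and that math is already delimited with \( \) or \[ \], so it can be passed through unchanged. Tables, lists and figures are not handled.

```python
import re

# Hypothetical helper: covers only headings and bold text.
# Math delimited by \( \) or \[ \] is left untouched (assumption).
HEADING_COMMANDS = {1: "section", 2: "subsection", 3: "subsubsection"}


def mmd_to_latex(mmd: str) -> str:
    """Convert a small subset of Markdown-style MMD text to LaTeX body text."""
    converted = []
    for line in mmd.splitlines():
        heading = re.match(r"^(#{1,3})\s+(.*)$", line)
        if heading:
            # "# Title" -> \section*{Title}, "## Title" -> \subsection*{Title}, ...
            command = HEADING_COMMANDS[len(heading.group(1))]
            line = f"\\{command}*{{{heading.group(2).strip()}}}"
        else:
            # "**text**" -> \textbf{text}
            line = re.sub(r"\*\*(.+?)\*\*", r"\\textbf{\1}", line)
        converted.append(line)
    return "\n".join(converted)


if __name__ == "__main__":
    sample = "# Question 1\nFind **all** roots of \\(x^2=1\\)."
    print(mmd_to_latex(sample))
```

The result is only the document body, so it still has to be placed inside a LaTeX preamble and \begin{document} ... \end{document} before it compiles.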
After investing significant time in research and experimentation, I am sharing these conclusions in the hope of saving time for others facing similar tasks. They come from extensive trial and error, and I trust they will serve as a helpful guide for anyone choosing among PDF to LaTeX conversion tools.