Gpt4o ocr mistakenly bolds math symbol uvw (mathbf)

ficapy · May 23, 2024, 4:48pm

I tried using gpt4o’s Vision API to recognize some mathematical literature and get LaTeX Markdown. In most cases, its accuracy reached a basically usable level. However, there is one particularly noticeable error: for u,v,w it repeatedly recognized them as bolded uvw thousands of times(Not correct even once).

For example, this formula was recognized as:


\[ \mathbf{v} \cdot \mathbf{w}_2 = 0 \]

But it should actually be:


\[ v \cdot w_2 = 0 \]

Although I can post-process and change all \mathbf{v} to v, I still hope the official team can improve this aspect.

N2U · May 23, 2024, 5:02pm

Add “avoid using bold font” to your prompt, here’s an example:

_j · May 23, 2024, 6:05pm

For me gpt-4o wouldn’t even have a look, instead, sending to python. Just a forum screenshot fixes its attitude though on a new PDF-sourced image.

anon22939549 · May 23, 2024, 8:06pm

Some of it might be just random chance. Because, using the model in ChatGPT it produces the correct result,

Though, in the playground I get the boldfaced version.

Now…

It might not actually be a “mistake.”

(Hear me out, please!)

1

Typically vectors are typeset in bold.

From on Wikipedia,

For representing a vector, the common typographic convention is lower case, upright boldface type, as in v. The International Organization for Standardization (ISO) recommends either bold italic serif, as in \mathbf{v}, or non-bold italic serif accented by a right arrow, as in \displaystyle {\vec {v}}.

It could be that Omni is recognizing this as a dot-product between vectors and “correcting” the typesetting according to what it believes it should be.

But, more likely…

2

Your image does appear to be bolded.

I made a *non-*bolded version in \LaTeX which you can see here:

This is a much higher resolution version, but if I crop and scale yours and mine you can see yours is clearly in a heavy typeface.

output

When I provide the lighter typeface version to the gpt-4o model, it seems to get the result correct every time.

So, why does gpt-4-turbo not struggle with this? I don’t know, probably just because it is a larger model.

Anyway, I would spend at least a bit of time seeing if you can just get a better quality original source image before picking a fight with the model over this.

ficapy · May 24, 2024, 3:10am

With my tests yesterday, today the GPT4o web version actually seems to recognize the above image more accurately…

I can’t get a clearer image, and personally, I think the provided scanned photo is clear enough for recognition.

I don’t believe \mathbf{v} and v are very similar; the difference between them is quite significant.

Although the scanned document does bold the v here, if a human were handling it, they wouldn’t mistakenly recognize it as bold in LaTeX (I’ve tried specialized LaTeX OCR tools like Mathpix, SimpleTex, and Pix2Tex).

I added “avoid using bold font for u, v, w” to the prompt, and the results improved somewhat (there were fewer bold fonts, but more LaTeX syntax errors, possibly because the model was more confused, insisting it was bold, resulting in similar invalid LaTeX syntax like \(\mathbf{ u ).

anon22939549 · May 24, 2024, 3:31am

I’m not saying that it’s “seeing” your scanned image as being the result of using \mathbf. I’m saying that it’s “seeing” your scanned image appears to be in a heavy (read: bold) font face and is making a “best effort” to give you what it “thinks” you want.

When I use an image with a lighter-weight font, there is zero issue.

As for humans, there is the \bm command from the bm package which produces a heavier-weight italic serif font.

Topic		Replies	Views
Improve image processing with number "1" and "7" API gpt-4-vision	6	375	November 19, 2024
API GPT-4-Turbo-Preview has output in latex format in math API gpt-4 , api	13	6988	February 17, 2025
OAI forcing the models to use LaTex for calc response, but I don't need LaTex Prompting gpt-35-turbo , api , assistants-api	2	876	April 23, 2024
Math style generated content is not getting consistent wrapped is $ or $$ Prompting gpt-4 , chatgpt	1	503	May 8, 2024
Can GPT-4 render Mathjax as 3.5 does? Prompting	6	2926	March 30, 2023

Gpt4o ocr mistakenly bolds math symbol uvw (mathbf)

1

2

Related topics