Vision API flips numbers on extracting text from image

f.jouti · December 12, 2023, 9:29pm

I’m using the Vision API to OCR some images. However, on several occasions GPT4 Vision has returned incorrect results when dealing with numbers: it simply flips digits. I’ve encountered this many times and on several fields. For example, on one field, instead of extracting 578154181 it extracted 57851418. Note that the images are high resolution and the letters and numbers are clearly visible. Is there any way to solve this?

Diet · December 13, 2023, 12:07am

I wouldn’t rely on a single shot of GPT4V as OCR. It’s probably best to combine GPT4V with classical OCR to get the best results.

f.jouti · December 13, 2023, 8:20pm

The problem is that it recognizes the right numbers when you try again; it just seems to make errors at random, whenever it wants to.
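
Since a retry often returns the correct digits, one possible workaround is to run the extraction several times and keep only an answer that a clear majority of runs agree on. The snippet below is a minimal sketch of that idea, not a tested fix: `extract` is a hypothetical callable standing in for whatever function sends the image to the Vision API and returns the extracted string, and the `runs` and `min_share` defaults are arbitrary assumptions.

```python
from collections import Counter
from typing import Callable, Optional


def consensus_reading(
    extract: Callable[[], str],  # hypothetical: wraps one Vision API call
    runs: int = 5,               # assumed number of repeated calls
    min_share: float = 0.6,      # assumed share of runs that must agree
) -> Optional[str]:
    """Call `extract` several times and return the majority answer.

    Returns None when no single answer reaches `min_share` of the runs,
    which signals that the field should be retried or reviewed by hand.
    """
    readings = [extract().strip() for _ in range(runs)]
    value, count = Counter(readings).most_common(1)[0]
    return value if count / runs >= min_share else None
```

This multiplies the cost per image by `runs`, and it only helps if the errors really are random rather than consistent for a given image.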

Diet · December 13, 2023, 8:41pm

LLMs make “errors” all the time. I’m suggesting that it may be a good idea to use other, more reliable tools to cross-validate or augment the capabilities of LLMs.
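
As a rough illustration of the cross-validation idea, the sketch below compares the digits read by the LLM with the digits read by a second tool and accepts the value only when both agree. It assumes you already have the two strings in hand; `llm_text` and `ocr_text` are hypothetical inputs (for example, the GPT4V output and the output of a classical OCR engine), and no particular OCR library is implied.

```python
import re
from typing import Optional


def digits_only(text: str) -> str:
    """Strip everything except digits so spacing and separators are ignored."""
    return re.sub(r"\D", "", text)


def cross_validate(llm_text: str, ocr_text: str) -> Optional[str]:
    """Return the digit string when both sources agree, otherwise None.

    A None result means the two readings differ (or contain no digits),
    so the field should be retried or sent for manual review.
    """
    llm_digits = digits_only(llm_text)
    ocr_digits = digits_only(ocr_text)
    if llm_digits and llm_digits == ocr_digits:
        return llm_digits
    return None
```

With the numbers from the original post, `cross_validate("57851418", "578154181")` returns `None`, so the flipped reading would be caught rather than silently accepted.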

