Does OpenAI prohibit OCR?

dandyfiner · April 18, 2025, 3:51pm

The documentation says input images may not contain text. Specifically, under Image Input Requirements → Other Requirements it says, “No Text”.

I just want to verify: does this means using OpenAI APIs for OCR

is prohibited?
is not supported?
is not possible?

We have used it successfully for OCR in the past. Also in the same document, there is mention of processing text in images, like that the model is not great at non-English and small text.

So what’s the story?

Thanks!

akitaishi · April 18, 2025, 4:00pm

Hi,

I do not think that OCR is prohibited. I might be wrong.
Personally speaking, I was able to get OCR with accuracy for my web app that I built.

I would like to listen to the other experts on this too, if there are any limitations.

Cheers.
Akitaishi

ben60 · April 19, 2025, 5:16am

In my experience:
4o can read text in images jpg/webp/png/screenshots pasted directly in.
A few issues with PDF scanned docs but nothing worse than existing OCR.
I can’t comment on non-English as not tried that

phyde1001 · April 19, 2025, 5:59am

OK this is what 4o says:

In the OpenAI Images API documentation, the guideline “No text” refers to a recommendation to avoid including textual elements within images submitted for processing. This is because models like GPT-4’s vision capabilities are not optimized for interpreting text embedded in images, which can lead to inaccuracies or misinterpretations. By providing images without text, you ensure that the model focuses on visual content, leading to more accurate and reliable analysis.

I have a good example that backs that up…

That said… This section is ‘API Requirements’ - Input images must meet the following requirements to be used in the API.

You might find a better solution here:

sps · April 22, 2025, 8:22pm

Welcome to the dev forum @dandyfiner

Images with text aren’t prohibited AFAIK.

In fact, here’s how OpenAI enables PDF content inputs:

How it works

To help models understand PDF content, we put into the model’s context both the extracted text and an image of each page. The model can then use both the text and the images to generate a response. This is useful, for example, if diagrams contain key information that isn’t in the text.

akitaishi · April 25, 2025, 4:08pm

I use GPT-4o for OCR via API in Algebraic Equation GPT4.

Topic		Replies	Views
Make OpenAI Vision API Match GPT4 Vision API chatgpt	4	3872	December 6, 2023
Can an assistant help me with OCR? API gpt-4	7	3561	June 6, 2024
How to Programmatically Extract Text from Images Using GPT-4 API gpt-4 , chatgpt , api , assistants-api	9	7702	October 14, 2024
GPT4 OCR/Image Recognition API gpt-4	3	25328	December 18, 2023
OCR using API for text extraction API api	9	14764	December 18, 2024

Does OpenAI prohibit OCR?

How it works

Related topics