GPT-4 omni text recognition via API works worse than on chatgpt.com

mariayeruh · May 31, 2024, 9:52am

Hello all!

I’m investigating poor text recognition via the API with GPT-4 Omni.
The original OpenAI chat on chatgpt.com works like a charm: the text is 100% equal to the PNG, with no fictional words or sentences.
If I use an API call to GPT-4o for a one-page text, I always get only the first paragraph almost correct; the others are fictional.

I tried custom prompts telling it to stop using Tesseract and to use its internal vision capabilities, but no luck. What should I do?

Merlin · May 31, 2024, 10:07am

Hey

This can be a bit counterintuitive, but you actually gave it the image as a file attachment and not as an image input.

There are two different things you can do with images and GPT:
Give it the file, which it can then use when coding (what you did),
or give it the image specifically for vision.

You are doing the wrong one.

This is what it looks like in the Playground (use the right button):
[Image: Screenshot 2024-05-31 at 12.04.41]

In the API it looks like this:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])
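
The example above uses a public URL. For a local PNG page like the one in the question, the same `image_url` field can carry a base64 data URL instead. The following is only a rough sketch under some assumptions: the helper names (`image_to_data_url`, `build_transcription_messages`, `transcribe_page`), the prompt wording, and the `max_tokens` value are made up for illustration, and `"detail": "high"` is an optional setting that may help with dense text.

```python
import base64
from pathlib import Path


def image_to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_transcription_messages(image_bytes: bytes, prompt: str) -> list:
    """Build a chat message that passes the image as vision input, not as a file."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_to_data_url(image_bytes),
                        "detail": "high",  # optional; assumed useful for small text
                    },
                },
            ],
        }
    ]


def transcribe_page(path: str) -> str:
    """Send a local PNG to gpt-4o and return the model's text reply."""
    # Imported here so the helpers above can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=build_transcription_messages(
            Path(path).read_bytes(),
            "Transcribe all text in this image exactly as written.",
        ),
        max_tokens=2000,  # assumed budget for a one-page text; adjust as needed
    )
    return response.choices[0].message.content
```

Calling `transcribe_page("page.png")` (the file name is a placeholder) would send the page for vision. A low `max_tokens` such as the 300 in the example above could cut off a full-page transcription, so it is worth raising when the goal is OCR-style output.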

If you still need help, let me know and please explain your set-up further.

mariayeruh · May 31, 2024, 12:48pm

Hi!
Yes, I used an image attachment.

WashedUp · June 17, 2024, 2:42am

Did you ever figure out a fix for this?

sairamkancharla2002 · August 13, 2024, 7:36pm

Same thing for me. I need to generate acceptance criteria for a description.
The description contains both an image and data,
but it was reading only the image.

I need GPT to read and analyze both the text and the image.
