Gpt-4o is not recognising the image properly

mathu · June 25, 2024, 7:44am

We are facing problem with chat completion api. The image recognised is wrong in the chat completion API. When I gave the below image, the response was not correct. The question was “find the slope of the line”. The response I received was that the two points are (-1, -4) (5, 2), but the correct points are (0, -3) (4, 0). The image is not working properly.

{
    "model": "gpt-4o",
    "messages":[{
                "role" => "user",
                "content" => [{
                              "type" => "text",
                              "text" => "find the slope of the line"
                             },
                             {
                              "type" => "image_url",
                              "image_url" => {
                                             "url" => "imageurl"
                                            },
                             }]
                }]
}

jr.2509 · June 25, 2024, 8:00am

Welcome to the Forum!

Give it a try with the following prompt:

Please identify the coordinates where the blue line intersects the x and y axis.

Worked well for me - though I will note that the results are not necessarily consistent across API requests. It tends to vary a bit between +/- 3 and +/- 4. However, it is not as far off as in your case.

NB: Based on your post, it does not seem like you are expecting a value for the slope of the line. Hence the change in prompt.

_j · June 25, 2024, 8:11am

I handed the image over to ChatGPT using gpt-4o, and more extensive techniques that could be used, even hinting that code interpreter was available.

Let’s identify two points from the graph:

The line crosses at (0,−4) and (6,2).

Vision failed, where the correct answer by my own looking would be (0, -3), (6, 1.8) approximately, a slope of 0.8. I even gave instruction to interpolate the values between grid, which was not done.

I improved the instruction, telling more about determining line crossing points:

From visual inspection, we can estimate the coordinates of these crossing points.

Let’s select a few points where the line crosses the grid lines:
(0, -4)
(1, -3)
(2, -2)
(3, -1)
(4, 0)
(5, 1)
(6, 2)

It even sent the image to python and had its own pixel grid overlaid. Nope.

This kind of vision analysis can be boosted in quality by multi-shot examples preceding the final problem. Lead-up messages of user input with image accompanied with correct simulated AI answers can both orient it to the task with more context and also improve the actual analysis.

Topic		Replies	Views
Gpt-4o api giving wrong response for image type questions API api	4	372	July 2, 2024
GPT-4o forgets image data and sometimes gives answers that have nothing to do with the image API api , assistants-api , gpt4o	10	1127	June 15, 2024
Image_url for gpt-4o api giving error "expected an object, but got a string instead.", Bugs gpt-4 , api	12	10159	July 1, 2024
GPT unable to view the content of image API image-reading , gpt-4o	0	130	November 1, 2024
Gpt 4o sometime not able to access image from url Bugs image-reading , gpt4o	1	589	August 12, 2024

Gpt-4o is not recognising the image properly

Related topics