We are facing problem with chat completion api. The image recognised is wrong in the chat completion API. When I gave the below image, the response was not correct. The question was “find the slope of the line”. The response I received was that the two points are (-1, -4) (5, 2), but the correct points are (0, -3) (4, 0). The image is not working properly.
Please identify the coordinates where the blue line intersects the x and y axis.
Worked well for me - though I will note that the results are not necessarily consistent across API requests. It tends to vary a bit between +/- 3 and +/- 4. However, it is not as far off as in your case.
NB: Based on your post, it does not seem like you are expecting a value for the slope of the line. Hence the change in prompt.
I handed the image over to ChatGPT using gpt-4o, and more extensive techniques that could be used, even hinting that code interpreter was available.
Let’s identify two points from the graph:
The line crosses at (0,−4) and (6,2).
Vision failed, where the correct answer by my own looking would be (0, -3), (6, 1.8) approximately, a slope of 0.8. I even gave instruction to interpolate the values between grid, which was not done.
I improved the instruction, telling more about determining line crossing points:
From visual inspection, we can estimate the coordinates of these crossing points.
Let’s select a few points where the line crosses the grid lines:
It even sent the image to python and had its own pixel grid overlaid. Nope.
This kind of vision analysis can be boosted in quality by multi-shot examples preceding the final problem. Lead-up messages of user input with image accompanied with correct simulated AI answers can both orient it to the task with more context and also improve the actual analysis.