I’m currently running hundreds of small 100 x 75 pixel images a day, plus other sizes, without issue. I haven’t had a single error in the past 7 days.
This is my calling function:
def analyze_image_with_openai(image, client, custom_prompt):
    base64_image = encode_image(image)
    chat_completion = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": custom_prompt},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{base64_image}",
                            "detail": "high"
                        }
                    }
                ]
            }
        ],
        max_tokens=300,
        model="gpt-4-vision-preview",
    )
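The `encode_image` helper is not shown in the post. A minimal sketch of what such a helper might look like, assuming it receives the raw image bytes (the input type and the body are assumptions, not the poster's actual code):

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    """Hypothetical helper: return base64 text for raw image bytes."""
    # b64encode returns bytes; decode to str so it can go into a data URL.
    return base64.b64encode(image_bytes).decode("ascii")


# Stand-in payload: the 8-byte PNG file signature, not a full image.
png_signature = b"\x89PNG\r\n\x1a\n"
data_url = f"data:image/png;base64,{encode_image(png_signature)}"
```

In practice the bytes would come from reading the image file or from serializing an in-memory image to PNG first.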
Thank you for sharing. This looks much the same as my requests, although there are 2 differences that could be key:
- The images I’m sending are JPEGs instead of PNGs.
- I’m using the detail: low mode instead of detail: high.
I’ll try to do some debugging on my side to see if any of those factors could be the issue.
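To compare the two setups, it may help to make the MIME type and detail level parameters so the same request can be sent both ways. A rough sketch, where `build_image_part` is a hypothetical helper name and the dictionary mirrors the `image_url` structure in the calling function above:

```python
def build_image_part(base64_image: str,
                     mime_type: str = "image/png",
                     detail: str = "high") -> dict:
    """Hypothetical helper: build the image part of a chat message."""
    # "low", "high" and "auto" are the detail levels the vision API accepts.
    if detail not in {"low", "high", "auto"}:
        raise ValueError(f"unsupported detail level: {detail!r}")
    return {
        "type": "image_url",
        "image_url": {
            "url": f"data:{mime_type};base64,{base64_image}",
            "detail": detail,
        },
    }


# The two variants discussed in this thread ("AAAA" is placeholder data).
png_high = build_image_part("AAAA")
jpeg_low = build_image_part("AAAA", mime_type="image/jpeg", detail="low")
```

Swapping one parameter at a time should show whether the format or the detail level is the factor that matters.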
I do hope the devs will look into this ASAP because it’s been a bottleneck for quite some time now, and it’s evidently a bug: it can fail on valid images, as various users have reported, some with example images.
UPDATE: It seems it’s actually heavily dependent on the image size; the higher the resolution, the flakier it gets. So far it seems I need to resize the images down to around the 100x75 resolution that you’re working with to make it work consistently. However, at this resolution the model makes a lot more errors in the vision outputs on my images (i.e. incorrect descriptions). It seems I need at least 128x128 for it to perform well, which brings back some of the flakiness (although a lot less than at higher resolutions). Either way, thanks a lot for sharing; I can at least limit the flakiness somewhat now.
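The resizing described above can be sketched as a size calculation that keeps the aspect ratio and shrinks the shorter side to 128 pixels. The function name and the choice to never upscale are assumptions; the resulting size would then be passed to whatever image library does the actual resize:

```python
def scale_to_min_side(width: int, height: int, min_side: int = 128) -> tuple[int, int]:
    """Hypothetical helper: downscale so the shorter side equals min_side."""
    shorter = min(width, height)
    # Images already at or below the target are left unchanged (no upscaling).
    if shorter <= min_side:
        return width, height
    new_width = max(1, round(width * min_side / shorter))
    new_height = max(1, round(height * min_side / shorter))
    return new_width, new_height


print(scale_to_min_side(1024, 768))  # (171, 128)
print(scale_to_min_side(100, 75))    # (100, 75)
```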
These are some of the bounding box coordinates I am cropping to before sending the crops to the model:
(867, 29, 1009, 63)
(248, 231, 300, 661)
(310, 66, 750, 167)
(519, 243, 965, 521)
(525, 569, 986, 683)
(528, 698, 987, 765)
(105, 208, 130, 233)
As you can see, some are very narrow and small. I’ve just tried about 150 calls with an image 25 pixels by 25 pixels containing a single x% reference value and it’s worked every time.
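For reference, the crop sizes implied by those coordinates can be worked out as below, assuming each tuple is in (left, top, right, bottom) order. That ordering is an assumption, though it is consistent with every box listed:

```python
# Bounding boxes from the post, assumed to be (left, top, right, bottom).
boxes = [
    (867, 29, 1009, 63),
    (248, 231, 300, 661),
    (310, 66, 750, 167),
    (519, 243, 965, 521),
    (525, 569, 986, 683),
    (528, 698, 987, 765),
    (105, 208, 130, 233),
]


def box_size(box: tuple[int, int, int, int]) -> tuple[int, int]:
    """Return (width, height) of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return right - left, bottom - top


sizes = [box_size(b) for b in boxes]
for box, (w, h) in zip(boxes, sizes):
    print(f"{box} -> {w} x {h} px")
```

Under that assumption the crops range from 25 x 25 up to 446 x 278 pixels, with the narrowest being 52 x 430.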
Have you tried using PNG? I’m at somewhat of a loss to explain what you are experiencing.
Ya. I’ve also seen this in some ad hoc testing. Unfortunately, as mentioned, reducing the scale degrades performance. Maybe I will run some evals to test this.
I encounter the same issue. Sometimes I get the error message saying the image size must be below 20, but when I rerun the same image it works again, so it is very unstable. I currently use a try block to catch the error and send the request again in the handler. For now the problem seems to be temporarily overcome, but this is only a temporary workaround; I hope someone can fix it.
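A minimal sketch of that kind of retry workaround, under a few assumptions: `call_with_retries` is a hypothetical helper name, the attempt count and delays are arbitrary, and catching the broad `Exception` is only for illustration (in real code it would be better to catch the specific error your client raises):

```python
import time


def call_with_retries(request, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Hypothetical helper: call request() again if it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            # Wait longer after each failure (1s, 2s, 4s, ... by default).
            sleep(base_delay * 2 ** (attempt - 1))
```

It could be used by wrapping the existing call, for example `call_with_retries(lambda: analyze_image_with_openai(image, client, custom_prompt))`, where the wrapped function is the one posted earlier in this thread.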