Since today I'm getting many 400 errors when sending requests to analyze JPEG images using the openai library (OpenAI.chat.completions.create, model: "gpt-4-vision-preview"). It also seems quite inconsistent/flaky: the errors don't appear for every image, and rerunning sometimes resolves the issue. The error specifically:
Error code: 400 - {'error': {'message': "You uploaded an unsupported image. Please make sure your image is below 20 MB in size and is of one the following formats: ['png', 'jpeg', 'gif', 'webp'].", 'type': 'invalid_request_error', 'param': None, 'code': 'image_parse_error'}}
I'm sending the base64 representation of the image (i.e. through the image_url object inside the message content, as 'data:image/jpeg;base64,...'), compressed to well within the 20 MB limit. I've also saved the encoded image locally to verify it manually, and everything looks fine and in line with the requirements.
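For reference, this is roughly how the request is built on my end (file path and prompt are placeholders; the detail setting is the one I come back to below):

```python
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def encode_image(path: str) -> str:
    # read the (already compressed) JPEG and base64-encode it
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

b64 = encode_image("example.jpg")  # placeholder path

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},  # placeholder prompt
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{b64}",
                        "detail": "low",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```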
I've only been experiencing these errors since today, so it seems like a bug was introduced today or yesterday. Can you please have a look into this?
Also, if anyone has any insights into this, it'd be much appreciated. Should you need any more information from my side, please let me know!
There’s really no need to be sending something that big.
The vision endpoint will resize the image so the longest dimension is under 2048 pixels, and then resize again so the shortest dimension is at most 768.
That makes a standard 3:2 ratio picture out of a camera or phone come out at 1152x768 - and then you also pay for 6 tiles, a tile being the 512x512 unit of the underlying vision model.
That 1152x768 the AI actually sees is at most about 2.66 MB as raw uncompressed pixels, and closer to 1.5 MB as a losslessly compressed PNG.
Smart image resizing before sending will help your budget, improve server-side speed by eliminating the resize (and perhaps improve the symptoms), and cut down on network transmission, as in the sketch below.
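If you want to do that client-side, roughly something like this with Pillow (the target sizes follow the resize rules above; the function name and JPEG quality are just for illustration):

```python
import base64
from io import BytesIO
from PIL import Image  # pip install Pillow

def shrink_for_vision(path: str, detail: str = "high") -> str:
    """Downscale locally to roughly what the endpoint would see, return a data URL."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    # longest side within 2048 (512 is plenty if you only use low-detail mode)
    target_long = 512 if detail == "low" else 2048
    scale = min(1.0, target_long / max(w, h))
    w, h = round(w * scale), round(h * scale)
    if detail != "low":
        # then cap the shortest side at 768
        scale = min(1.0, 768 / min(w, h))
        w, h = round(w * scale), round(h * scale)
    img = img.resize((w, h), Image.LANCZOS)
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=90)
    return "data:image/jpeg;base64," + base64.b64encode(buf.getvalue()).decode("utf-8")
```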
Thank you for the advice! I'm definitely not sending images anywhere close to 20 MB; I only mentioned the limit to indicate that size should not be the cause of the problem, since the images are within the accepted bounds. FYI: I automatically rescale the images I'm sending, so they're generally around 1-2 MB. They are high resolution though, e.g. 8000x12000, due to the upscaling software used, so I also resize them to within reasonable limits (10 megapixels), although this should already happen when setting the detail='low' param on the API.
As for the pricing, I believe this shouldn't make a difference since I'm using the detail='low' setting, i.e. low resolution mode.
For the model input, yes. I believe it gets rescaled automatically in the pipeline on OpenAI's side between the API and the model when using low-res mode. I've been sending larger-resolution images in low-res mode since it launched and it has always worked fine.
Quoting from their own documentation:
A 4096 x 8192 image in detail: low mode costs 85 tokens
Regardless of input size, low detail images are a fixed cost.
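So as a sanity check, this is my understanding of the documented token math (the high-detail branch is only there for comparison; tile size per the 512x512 units mentioned above):

```python
import math

def image_token_cost(width: int, height: int, detail: str = "low") -> int:
    """My reading of the documented pricing: low detail is a flat 85 tokens."""
    if detail == "low":
        return 85  # fixed cost, regardless of input resolution
    # high detail: fit within 2048x2048, shortest side to 768, then count 512px tiles
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles

print(image_token_cost(4096, 8192, detail="low"))   # 85, per the doc example
print(image_token_cost(8000, 12000, detail="low"))  # still 85
```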
The “magnets” screen grab (looks like you also transcribe Feynman) is 1194x908
That is resized by the API to 1010x768.
It may be that the vision model doesn't like the black fill in the resulting four tiles and the extra padding.
Sending at under 512x512 and/or setting the API detail parameter to "low" may make these problems go away, especially when the source is SD video under 640x480.
Hey! This image also works for me; it might have to do with some formatting change from being uploaded to the forum platform and then downloaded by me.
Unfortunately, still no response from help.openai.com. I've temporarily built in an autonomous retry mechanism that will eventually process everything, but it's far from ideal.
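For completeness, the stop-gap looks roughly like this (simplified; the real version queues failed images rather than blocking):

```python
import time
from openai import OpenAI, BadRequestError

client = OpenAI()

def create_with_retry(messages, max_attempts: int = 5):
    """Retry only the flaky image_parse_error 400s; anything else is a real error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.chat.completions.create(
                model="gpt-4-vision-preview",
                messages=messages,
                max_tokens=300,
            )
        except BadRequestError as e:
            if "image_parse_error" not in str(e) or attempt == max_attempts:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff before retrying
```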
I agree; I'm experiencing the exact same non-deterministic behavior. I've retried the same image multiple times, and sometimes it goes through fine, other times it doesn't. It can get through on the 1st or 2nd attempt, but it can also take many retries; it's very flaky.