"Invalid image" error in gpt-4-vision

Hello, I encountered a similar issue, which I believe is due to OpenAI. To address it, I devised a solution that proved helpful. I implemented a loop:

while (!description.length) {
  logger.log("info", `Trying to request description again for the image ${imageUrls}`);
  description = await describeImage(
    imageUrls,
    "group by jeans colors, return indexes and colors, if single image then just index, " +
      "please remove all spaces between commas and indexes, " +
      "index always starts from 0 example: color=[0];color=[1,2]"
  );
  count++;
  // Assumes setTimeout is the promise version from "node:timers/promises";
  // awaiting the global setTimeout would not pause.
  await setTimeout(2000);
}

In the end, on the second or third iteration, I no longer received an error.
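For anyone who wants the same retry idea with an upper bound on attempts, here is a minimal Python sketch. The helper name `retry_until_nonempty` and its parameters are hypothetical, and `request` stands in for whatever function actually calls the API; treating any exception like an empty answer is an assumption made for brevity.

```python
import time
from typing import Callable, Optional


def retry_until_nonempty(
    request: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `request` until it returns a non-empty string or attempts run out."""
    for attempt in range(max_attempts):
        try:
            result = request()
        except Exception:
            # Assumption: an API error is handled like an empty answer.
            result = ""
        if result:
            return result
        if attempt < max_attempts - 1:
            # Wait longer after each failure: 2 s, 4 s, 8 s, ...
            sleep(base_delay * (2 ** attempt))
    return None
```

Unlike the loop above, this gives up and returns `None` after `max_attempts` tries, so a permanently rejected image does not keep the caller spinning.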

Just chiming in here. I am facing the same issue with S3 pre-signed URLs and with anything behind a Cloudflare WAF that requires a token to access the file.

Anyone find a solution to this yet?

Same problem here, passing 2 image URLs via the API. Sometimes it works fine, other times it fails with “invalid_image”. These are the exact same API calls and URLs, so it seems intermittent. I did notice that the ones that work seem to have the same remote OpenAI IP address, but that could be a coincidence.

There are many sites like Imgur or Reddit that will take a direct image file link and serve an HTML page instead if the image isn’t being embedded into that site.

When you use the URL parameter, you have no control over this, but the behavior should at least be consistent.

It’s possible that the image site operator has proactively blocked some of OpenAI’s image request IP addresses, but not all of them.

Downloading the image yourself may be slower, but is the better choice. You can resize it to control costs and improve transmission speed, and also set up some referrer and header impersonation rules for many popular sites to get to the image specified. You can see the failure without having to call OpenAI.
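As a rough illustration of seeing the failure without calling OpenAI: once you have downloaded the bytes yourself, you can check the file signature to confirm you got an image in one of the four accepted formats rather than an HTML page. This is a sketch only; the function name `sniff_image_format` is hypothetical, and it inspects just the leading bytes, so it will not catch truncated or otherwise corrupt files.

```python
from typing import Optional


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return 'png', 'jpeg', 'gif' or 'webp' based on the file signature, else None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```

If this returns `None` for the bytes your HTTP client fetched, look at what actually came back (often an HTML page or an access-denied body) before blaming the API.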

Still facing this “Invalid image” error in gpt-4o-mini as well, but it’s too frequent in gpt-4o.

2.3 MB JPG: “You uploaded an unsupported image. Please make sure your image is below 20 MB in size and is of one the following formats: [‘png’, ‘jpeg’, ‘gif’, ‘webp’].”
The error occurs with both the gpt-4o and gpt-4o-mini models.

UPD: That was a resolution problem in our case.

Can you provide more detail about what you mean by “resolution problem”?

Hi, I’m having the same issue. I tried encoding the image in base64 but got the same error. I also tried adding max_tokens and changing the detail setting for the image, but that doesn’t work. Retrying the same request doesn’t work either.
This is an example of the problematic images:

s3.amazonaws.com/static.picap.co/packages/66ac0529b6ba65004e0dc715/pickup_photos/DA008149-39AC-466E-B1E1-967F8D8FA96B.jpg

Something curious I found is that it fails on the URLs of images taken on an iPhone. URLs of Android images like this one work normally:

s3.amazonaws.com/static.picap.co/packages/66ad3ed1b6ba65005e0de4d8/deliver_photos/CAP7972257294176822232.jpg

Same issue with iPhone pics

Having a similar issue with base64-encoded images, also coming from an iPhone camera.

Having the same type of issue.

I got iPhone-shot pictures to work by converting the image to WebP instead of JPEG. The problem is that conversion to WebP takes a lot of resources and greatly increases the time from shooting an image to posting it to the endpoint.

Same problem with Android (Samsung S23) images. It has been happening for about a week now.
Seems like it is on OpenAI’s side? Hope this gets fixed soon.

UPDATE:
When I crop the image taken on the phone to a smaller size and then use it as input, no error occurs. So it does seem to be a resolution thing, as @n1kron92 suggested.

For a week or so now I’ve been experiencing the same issue with images taken on my iPhone. Best solution I’ve found is to take whatever image you’re using and write a bit of code to convert it to a new .jpg or .png file.

UPDATE2: What helped was resizing the input image. In my case the input image gets saved locally, and then its path is used to base64-encode it before it goes to the API. Now the image gets resized (according to the vision guidelines) and then encoded.

import base64
import io

import requests
from PIL import Image

# Function to resize the image while maintaining aspect ratio
def resize_image(generic_image_path, max_short_dimension=768, max_long_dimension=2000):
    # Open the image file
    generic_image = Image.open(generic_image_path)
    generic_width, generic_height = generic_image.size

    # Determine which dimension is the limiting factor
    if generic_width > generic_height:
        scaling_factor = min(max_long_dimension / generic_width, max_short_dimension / generic_height)
    else:
        scaling_factor = min(max_short_dimension / generic_width, max_long_dimension / generic_height)

    # Calculate new dimensions
    new_width = int(generic_width * scaling_factor)
    new_height = int(generic_height * scaling_factor)

    # Resize the image
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    generic_image = generic_image.resize((new_width, new_height), Image.LANCZOS)

    return generic_image

# Function to convert the image to base64 (as requested by OpenAI)
def image_to_base64(generic_image_path, max_short_dimension=768, max_long_dimension=2000):
    # Resize the image
    resized_generic_image = resize_image(generic_image_path, max_short_dimension, max_long_dimension)

    # Convert resized image to base64
    generic_buffered = io.BytesIO()
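    # JPEG has no alpha channel, so convert to RGB first (needed for RGBA or palette PNGs)
    resized_generic_image.convert("RGB").save(generic_buffered, format="JPEG")  # You can change format if necessary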
    return base64.b64encode(generic_buffered.getvalue()).decode('utf-8')

# Function to extract description/code from the traffic violation notice image
def extract_image_details(openai_api_key, generic_image_path):
    # Encode the image
    base64_encoded_image = image_to_base64(generic_image_path)

    generic_headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {openai_api_key}"
    }

    generic_payload = {
        "model": "gpt-4o",
        "response_format": { "type": "json_object" },
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/jpeg;base64,{base64_encoded_image}"
                        }
                    }
                ],
            },
        ],
        "temperature": 0,
        "max_tokens": 300
    }

    generic_response = requests.post("https://api.openai.com/v1/chat/completions", headers=generic_headers, json=generic_payload)

    return generic_response


I am facing this exact issue as well - OpenAI 400 (Invalid image format, file-size needs to be less than 20Mb)

I am sending base64-encoded images to OpenAI, using gpt-4o.

I checked the images: they are in .png format and each is less than 5 MB.

I added a resize function to reduce the size by 50%, but still get the same error.

I even tried removing the EXIF metadata (it seems images taken on an iPhone have extra metadata that can cause OpenAI errors), but got the same error.

Any other solutions I can try?

Scaling down all my images to fit within 2000x768 solved my issues. But the error message could be better.

Today, I received yet another Invalid image error.

  • I’m sending 25 URLs in a single request
  • All images are 1309x638 dimension
  • All images are accessible and valid PNG images
  • The total size of the request is ~2.5MB

Requests with bigger image dimensions have no issues.

It would be great to get an actionable error message instead of that cryptic Invalid image message.

UPDATE:

It looks like it’s related to a token limit rather than to the image. The request succeeded after reducing the request size (deleting either text or images).

I’m still confused. According to my approximate calculations, the failing request comes to ~37k tokens. At the same time, I’m using GPT-4o, which has a context window of 128k. It looks like the actual limit is around 30k. Something doesn’t add up here.

Failing request:
25 images * ~1105 tokens + (9263 tokens for system & user messages) ~= 36,888 tokens

Successful request:
19 images * ~1105 tokens + (7580 tokens for system & user messages) ~= 28,575 tokens

Successful request:
19 images * ~1105 tokens + (8858 tokens for system & user messages) ~= 29,853 tokens

Successful request:
20 images * ~1105 tokens + (6204 tokens for system & user messages) ~= 28,304 tokens

Those API responses, token calculations, and limits are so confusing.
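For reference, the ~1105 tokens per image above is consistent with the tile-based high-detail formula in OpenAI’s vision guide for gpt-4o at the time: 85 base tokens plus 170 per 512-pixel tile, after fitting the image within 2048x2048 and shrinking the shortest side to at most 768. A small sketch under that assumption (the function name `image_tokens` is hypothetical, and the constants may differ for other models or change over time):

```python
import math


def image_tokens(width: int, height: int) -> int:
    """Estimate high-detail image tokens, assuming the 85 + 170-per-tile formula."""
    # Step 1: scale down (never up) to fit within 2048 x 2048.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Step 2: scale down (never up) so the shortest side is at most 768.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    # Step 3: count the 512-pixel tiles needed to cover the result.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


# The failing request above: 25 images of 1309x638 plus 9263 text tokens.
per_image = image_tokens(1309, 638)  # 3 x 2 tiles, so 85 + 170 * 6 = 1105
failing_total = 25 * per_image + 9263  # 36888
```

This only reproduces the estimate; it does not explain why the request was rejected at that size.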

UPDATE 2:

By the way, our organization is currently in usage tier 5.