Hello, I encountered a similar issue, which I believe is due to OpenAI. To address it, I devised a solution that proved helpful. I implemented a loop:
// Assuming Node.js: the promise-based setTimeout is needed for the await below to actually pause
import { setTimeout } from "node:timers/promises";

while (!description.length) {
  logger.log(info, `Trying to request description again for the image ${imageUrls}`);
  description = await describeImage(imageUrls, "group by jeans colors, return indexes and colors, if single image then just index, please remove all spaces between commas and indexes, index always starts from 0 example: color=[0];color=[1,2]");
  count++;
  await setTimeout(2000); // wait 2 seconds before the next attempt
}
In the end, on the second or third iteration, I no longer received an error.
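For anyone who wants the same retry idea in Python, here is a minimal sketch. It differs from the loop above in that it gives up after a fixed number of attempts and backs off between tries. The names `retry_until_nonempty` and `request_fn` are hypothetical; `request_fn` stands in for whatever call you make to the API.

```python
import time


def retry_until_nonempty(request_fn, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call request_fn until it returns a non-empty result or attempts run out."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            result = request_fn()
            if result:
                return result
        except Exception as exc:  # in real code, catch only the API client's error type
            last_error = exc
        if attempt < max_attempts - 1:
            # Exponential backoff: 2 s, 4 s, 8 s, ... with the default base_delay
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"no result after {max_attempts} attempts") from last_error
```

The `sleep` parameter is only there so the waiting can be swapped out in tests. A cap on attempts matters here because a request that fails for a deterministic reason would otherwise loop forever.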
Same problem here. I'm passing 2 image URLs via the API: sometimes it works fine, other times it fails with “invalid_image”. The API calls and URLs are exactly the same, so it seems intermittent. I did notice that the requests that work seem to have the same remote OpenAI IP address, but that could be a coincidence.
Many sites, such as Imgur or Reddit, will take a direct image file link and serve an HTML page instead when the image isn’t being embedded in that site.
When you use the URL parameter, you have no control over this, but the behavior should at least be consistent.
It’s possible that the image site operator has proactively blocked some of OpenAI’s image request IP addresses, but not all of them.
Downloading the image yourself may be slower, but is the better choice. You can resize it to control costs and improve transmission speed, and also set up some referrer and header impersonation rules for many popular sites to get to the image specified. You can see the failure without having to call OpenAI.
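A rough sketch of the download-it-yourself approach, using only the standard library. The function names, the `User-Agent` value, and the optional `referer` parameter are my own assumptions for illustration; adjust the headers per site. The point is that a site answering with an HTML page shows up as an error on your side before any OpenAI call is made.

```python
import base64
import urllib.request


def is_image_content_type(content_type):
    """True if a Content-Type header value names an image, e.g. 'image/png'."""
    return (content_type or "").split(";")[0].strip().lower().startswith("image/")


def to_data_url(image_bytes, mime_type):
    """Build the data URL form that can be passed in the image_url field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def fetch_image_as_data_url(url, referer=None, timeout=10):
    """Download url and return a data URL, or raise if the server did not send an image."""
    headers = {"User-Agent": "Mozilla/5.0", "Accept": "image/*"}
    if referer:
        headers["Referer"] = referer
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        body = response.read()
    if not is_image_content_type(content_type):
        # Typical symptom of a host rewriting the link into an HTML page
        raise ValueError(f"expected an image but got {content_type!r} from {url}")
    return to_data_url(body, content_type.split(";")[0].strip().lower())
```

The returned string goes where the remote URL used to go in the request.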
A 2.3 MB JPG fails with: “You uploaded an unsupported image. Please make sure your image is below 20 MB in size and is of one the following formats: [‘png’, ‘jpeg’, ‘gif’, ‘webp’].”
The error occurs with both the gpt-4o and gpt-4o-mini models.
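To rule out the two causes the error message actually names, a quick local check can be run before sending. This is only a sketch: the helper names are hypothetical, the 20 MB limit is taken from the message (I am assuming it means 20 * 1024 * 1024 bytes), and the format test looks at the file's leading bytes rather than its extension. As the report above shows, an image can pass both checks and still be rejected, so this narrows the problem down rather than fixing it.

```python
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" in the error message means MiB


def sniff_image_format(data):
    """Guess the format from the file signature; returns None if unrecognized."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def preflight_problem(data):
    """Return a description of the first problem found, or None if both checks pass."""
    if len(data) >= MAX_IMAGE_BYTES:
        return f"image is {len(data)} bytes, not below {MAX_IMAGE_BYTES}"
    if sniff_image_format(data) is None:
        return "content is not a png, jpeg, gif, or webp file"
    return None
```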
Hi, I’m having the same issue. I tried encoding the image in base64 but got the same error. I also tried adding max_tokens and changing the detail setting for the image, but that doesn’t work. Retrying the same request doesn’t work either.
This is an example of the problematic images:
I got pictures shot on an iPhone to work by converting them to WebP instead of JPEG. The problem is that conversion to WebP takes a lot of resources and greatly increases the time from shooting an image to posting it to the endpoint.
Same problem with Android (Samsung S23) images. It has been happening for about a week now.
It seems like it is on OpenAI’s side? Hope this gets fixed soon.
UPDATE:
When I crop the image taken with the phone to a smaller size and then use it as input, no error occurs. So it does seem to be a resolution issue, as @n1kron92 suggested.
For a week or so now I’ve been experiencing the same issue with images taken on my iPhone. Best solution I’ve found is to take whatever image you’re using and write a bit of code to convert it to a new .jpg or .png file.
UPDATE 2: What helped was resizing the input image. In my case the input image is saved locally, and its path is then used to base64-encode it before it goes to the API. Now the image is resized (according to the vision preferences) and then encoded.
    import base64
    import io

    import requests
    from PIL import Image

    # Function to resize the image while maintaining aspect ratio
    def resize_image(generic_image_path, max_short_dimension=768, max_long_dimension=2000):
        # Open the image file
        generic_image = Image.open(generic_image_path)
        generic_width, generic_height = generic_image.size
        # Determine which dimension is the limiting factor
        if generic_width > generic_height:
            scaling_factor = min(max_long_dimension / generic_width, max_short_dimension / generic_height)
        else:
            scaling_factor = min(max_short_dimension / generic_width, max_long_dimension / generic_height)
        # Calculate new dimensions
        new_width = int(generic_width * scaling_factor)
        new_height = int(generic_height * scaling_factor)
        # Resize the image (Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter)
        generic_image = generic_image.resize((new_width, new_height), Image.LANCZOS)
        return generic_image

    # Function to convert the image to base64 (as requested by OpenAI)
    def image_to_base64(generic_image_path, max_short_dimension=768, max_long_dimension=2000):
        # Resize the image
        resized_generic_image = resize_image(generic_image_path, max_short_dimension, max_long_dimension)
        # JPEG cannot store transparency or palettes, so convert to RGB before saving
        resized_generic_image = resized_generic_image.convert("RGB")
        # Convert resized image to base64
        generic_buffered = io.BytesIO()
        resized_generic_image.save(generic_buffered, format="JPEG")  # You can change format if necessary
        return base64.b64encode(generic_buffered.getvalue()).decode('utf-8')

    # Function to extract description/code from the traffic violation notice image
    def extract_image_details(openai_api_key, generic_image_path):
        # Encode the image
        base64_encoded_image = image_to_base64(generic_image_path)
        generic_headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {openai_api_key}"
        }
        # The text prompt is omitted here. As far as I know, the json_object
        # response format also needs a text part that asks for JSON.
        generic_payload = {
            "model": "gpt-4o",
            "response_format": { "type": "json_object" },
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": f"data:image/jpeg;base64,{base64_encoded_image}"
                            }
                        }
                    ],
                },
            ],
            "temperature": 0,
            "max_tokens": 300
        }
        generic_response = requests.post("https://api.openai.com/v1/chat/completions", headers=generic_headers, json=generic_payload)
        return generic_response
Today, I received yet another Invalid image error.
I’m sending 25 URLs in a single request
All images are 1309x638 pixels
All images are accessible and valid PNG images
The total size of the request is ~2.5MB
Requests with bigger image dimensions have no issues.
It would be great to get an actionable error message instead of that cryptic Invalid image message.
UPDATE:
It looks like it’s related to a token limit rather than to the images themselves. The request succeeded after reducing the request size (deleting either text or images).
I’m still confused. According to my approximate calculations, the failing request was ~37k tokens. At the same time, I’m using GPT-4o, which has a context window of 128k tokens. It looks like the actual limit is around 30k. Something doesn’t add up here to me.
Failing request:
25 images * ~1105 tokens + (9263 tokens for system & user messages) ~= 36,888 tokens
Successful request:
19 images * ~1105 tokens + (7580 tokens for system & user messages) ~= 28,575 tokens
Successful request:
19 images * ~1105 tokens + (8858 tokens for system & user messages) ~= 29,853 tokens
Successful request:
20 images * ~1105 tokens + (6204 tokens for system & user messages) ~= 28,304 tokens
Those API responses, token calculations, and limits are so confusing.
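For reference, the ~1105 tokens per image used above can be reproduced with a small sketch. It assumes the tile-based formula OpenAI documents for high-detail images on GPT-4o (85 base tokens plus 170 per 512x512 tile, after downscaling to fit 2048x2048 and then to a shortest side of at most 768). The function names are hypothetical, and the text token counts are simply the numbers quoted above, not computed here.

```python
import math


def estimate_image_tokens(width, height, detail="high"):
    """Estimate vision tokens for one image under the assumed tile formula."""
    if detail == "low":
        return 85
    # Step 1: shrink to fit inside 2048x2048 (never upscale)
    scale = min(1.0, 2048 / max(width, height))
    width, height = round(width * scale), round(height * scale)
    # Step 2: shrink so the shortest side is at most 768 (never upscale)
    scale = min(1.0, 768 / min(width, height))
    width, height = round(width * scale), round(height * scale)
    # Step 3: count 512x512 tiles
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles


def estimate_request_tokens(image_count, width, height, text_tokens):
    """Total for a request with image_count same-sized images plus text."""
    return image_count * estimate_image_tokens(width, height) + text_tokens


print(estimate_image_tokens(1309, 638))                # 1105 (3 x 2 tiles)
print(estimate_request_tokens(25, 1309, 638, 9263))    # 36888, the failing request
print(estimate_request_tokens(19, 1309, 638, 8858))    # 29853, the largest successful one
```

Under this estimate the failure threshold sits somewhere between 29,853 and 36,888 tokens, which matches the "around 30k" observation but does not explain where that limit comes from.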
UPDATE 2:
By the way, our organization is currently in usage tier 5.