"I'm sorry, I can't assist with these requests." with Vision API

Hey! I’m trying to process multiple images with this Python code:

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe these images.",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "ENTER_URL_HERE",
                    },
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "ENTER_URL_HERE",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0])

But I’m getting this error:

Choice(finish_reason=None, index=0, message=ChatCompletionMessage(content="I'm sorry, I can't assist with these requests.", role='assistant', function_call=None, tool_calls=None), finish_details={'type': 'stop', 'stop': '<|fim_suffix|>'})


Please advise. The pictures are SFW, the API key is correct.

Hey there! It looks like there’s a small issue with the JSON structure in your code. When specifying the image URL, you should assign the URL string directly to the "image_url" key, instead of using an object with a "url" key inside it. Also, make sure that the URLs of the images are entered as strings. Here’s how your corrected code should look:

    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe these images.",
                },
                {
                    "type": "image_url",
                    "image_url": "ENTER_URL_HERE",  # Ensure this is a string.
                },
            ]
        }
    ]

With these changes, your API request should be processed correctly as long as the image URLs are accessible and valid. Hope this helps!

Hey, thanks for the reply. I just tried running this:

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe these images.",
                },
                {
                    "type": "image_url",
                    "image_url": "ENTERED_MY_JPG_URL_HERE",
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])

But got the error message:

Choice(finish_reason=None, index=0, message=ChatCompletionMessage(content="Sorry, I can't help with that request.", role='assistant', function_call=None, tool_calls=None), finish_details={'type': 'stop', 'stop': '<|fim_suffix|>'})


I am seeing the same error. The first request I made correctly described the images, but subsequent attempts with the same prompt all return a string like “Sorry, I can’t provide any details on these images.”. The images are SFW.

Having OpenAI download images from a URL themselves is inherently problematic. The host can treat their fetcher as an IP to block, and the fetcher also respects robots.txt, arguably more strictly than it needs to. The image URL can also get served an HTML landing page or wrapper instead of the file, or can depend on a login. And the image itself just might not be tolerated, like WebP data served under a .png name.
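
If you do fetch the image yourself, one quick way to catch the last two cases is to look at the first bytes of whatever came back. This is a minimal sketch using only the standard library; `sniff_image_format` is a hypothetical helper name, and it only knows the four formats the vision guide lists as supported (PNG, JPEG, GIF, WebP).

```python
from typing import Optional


def sniff_image_format(data: bytes) -> Optional[str]:
    """Guess the real image format from the leading magic bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None  # e.g. an HTML landing page or a login wall


# WebP bytes are reported as "webp" even if the file was named .png.
print(sniff_image_format(b"RIFF\x00\x00\x00\x00WEBPVP8 "))
# HTML served in place of the image is reported as None.
print(sniff_image_format(b"<!DOCTYPE html><html>"))
```

A `None` result, or a format that disagrees with the file extension, is a hint that the URL is not serving what you think it is.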

Here’s a script that submits your image from a local file instead, so you can see whether the AI reports problems. I extended it for troubleshooting. Once that works, do your own image-grabbing or file serving.

'''OpenAI gpt-4-vision example script from image file
uses pillow to resize and make png: pip install pillow'''

import base64
from openai import OpenAI
from io import BytesIO
from PIL import Image

def encode_image(image_path, max_image=512):
    with Image.open(image_path) as img:
        width, height = img.size
        max_dim = max(width, height)
        if max_dim > max_image:
            scale_factor = max_image / max_dim
            new_width = int(width * scale_factor)
            new_height = int(height * scale_factor)
            img = img.resize((new_width, new_height))

        buffered = BytesIO()
        img.save(buffered, format="PNG")
        img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
        return img_str

client = OpenAI()
image_file = "myImage.jpg"
max_size = 512  # set to maximum dimension to allow (512=1 tile, 2048=max)
encoded_string = encode_image(image_file, max_size)

system_prompt = ("You are an expert at analyzing images with computer vision. In case of error, "
 "make a full report of the cause of: any issues in receiving, understanding, or describing images")
user = ("Describe the contents and layout of my image.")

apiresponse = client.chat.completions.with_raw_response.create(
    model="gpt-4-vision-preview",
    messages=[
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user},
                {
                    "type": "image_url",
                    "image_url": {"url":
                        # encode_image() saves as PNG, so label the data as PNG
                        f"data:image/png;base64,{encoded_string}"},
                },
            ],
        },
    ],
    max_tokens=500,
)
debug_sent = apiresponse.http_request.content
chat_completion = apiresponse.parse()
print(chat_completion.choices[0].message.content)
print(chat_completion.usage.model_dump())
print(
    "remaining-requests: "
    f"{apiresponse.headers.get('x-ratelimit-remaining-requests')}"
)

The script also includes a resizer to cut down on your costs. It’s set to 512, which fits in one tile; 1024 is the next logical step up.
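
To see roughly what a given size costs, here is a hedged sketch of the tiling arithmetic as the vision guide described it for gpt-4-vision-preview: a flat 85 tokens in low detail, otherwise 170 tokens per 512 px tile plus 85. The function name `estimate_image_tokens` is hypothetical, and the rule that images are only ever scaled down, never up, is an assumption. Treat the result as an estimate and compare it with the `usage` the API actually reports.

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate prompt tokens for one image under the documented tiling rule."""
    if detail == "low":
        return 85  # flat cost, regardless of size
    # Step 1: fit within a 2048 x 2048 square (assumption: never upscale).
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # Step 2: shrink so the shortest side is at most 768 px.
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    # Step 3: count 512 px tiles; 170 tokens each plus an 85-token base.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles


print(estimate_image_tokens(512, 512))    # 255 (one tile)
print(estimate_image_tokens(1024, 1024))  # 765 (four tiles)
```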

Nice, thanks! :pray: I ended up using images from the local directory as well instead of pulling them from URLs. Good idea resizing.
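
For anyone landing here with the original multi-image question, a minimal sketch of building the `content` list from several local files might look like this. `image_part` and `build_content` are hypothetical helper names, and the MIME type is guessed from the file name only, so a misnamed file will be mislabelled (and `.webp` may not be recognised on older Python versions).

```python
import base64
import mimetypes
from pathlib import Path
from typing import List


def image_part(path: str) -> dict:
    """Build one image_url content part from a local file as a data URL."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image file name: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }


def build_content(prompt: str, paths: List[str]) -> List[dict]:
    """Text part first, then one image part per local file."""
    return [{"type": "text", "text": prompt}] + [image_part(p) for p in paths]
```

The result can be passed as `messages=[{"role": "user", "content": build_content("Describe these images.", ["a.jpg", "b.jpg"])}]`, where the file names are placeholders.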