Image_url for gpt-4o API giving error "expected an object, but got a string instead."

I ran the exact code given in the documentation for the Vision API.

But I got the error below. The same applies to the gpt-4-turbo model. Am I missing something?

openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid type for 'messages[0].content[1].image_url': expected an object, but got a string instead.", 'type': 'invalid_request_error', 'param': 'messages[0].content[1].image_url', 'code': 'invalid_type'}}

Here is the code:


from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])

Your code appears to be missing the object structure under the “image_url” key.
The correct format should look like this:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                        "detail": "high"
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)

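The “detail” key can also be set to “low” or “auto”. As far as I know it is optional and defaults to “auto” when omitted, so the part that actually matters is the object with the “url” key.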

I often make this mistake too, and it can be a bit confusing.
I hope this helps you even a little🙂
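
If you have older code that still passes the URL as a bare string, a small helper along these lines can rewrite the message content into the object form. This is only a sketch: `normalize_image_parts` is a name I made up (it is not part of the openai package), the example.com URL is a placeholder, and the default `detail` of "high" just mirrors the example in this post.

```python
from typing import Any


def normalize_image_parts(
    content: list[dict[str, Any]], detail: str = "high"
) -> list[dict[str, Any]]:
    """Return a copy of `content` where every image_url part uses the object form.

    Hypothetical helper for illustration, not part of the openai package.
    """
    normalized = []
    for part in content:
        if part.get("type") == "image_url" and isinstance(part.get("image_url"), str):
            # Wrap the bare string in the {"url": ..., "detail": ...} object.
            part = {**part, "image_url": {"url": part["image_url"], "detail": detail}}
        normalized.append(part)
    return normalized


# Placeholder content in the old (string) style.
content = [
    {"type": "text", "text": "What's in this image?"},
    {"type": "image_url", "image_url": "https://example.com/boardwalk.jpg"},
]
fixed = normalize_image_parts(content)
```

You would then pass `fixed` as the "content" of the user message. The original list is left untouched, so you can still log what the caller sent.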

Thank you! That works.
But I ran into a new problem when using an AWS S3 presigned URL, which gave the same error. I will explore further.

I get the same error when using an S3 presigned URL with gpt-4o or gpt-4-turbo. No error with gpt-4-vision-preview. OP, have you found a solution?

Hi @ebobr, for me the problem went away when I did not include max_tokens in the request. But this is not an actual solution; it's a workaround we found. The problem with S3 bucket URLs still exists.
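
Not a confirmed fix for the S3 case, but one way to take URL fetching out of the picture is to download the object yourself (for example with your own S3 client) and send the bytes as a base64 data URL, which the vision guide also describes. A rough sketch, assuming you already have the image bytes in memory; the helper names are made up for illustration, and the three bytes at the bottom are only a stand-in for real image data:

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part_from_bytes(
    image_bytes: bytes, mime_type: str = "image/jpeg", detail: str = "high"
) -> dict:
    """Build an image_url content part in the object form (hypothetical helper)."""
    return {
        "type": "image_url",
        "image_url": {"url": to_data_url(image_bytes, mime_type), "detail": detail},
    }


# Stand-in bytes (the first three bytes of a JPEG file), not a real image.
part = image_part_from_bytes(b"\xff\xd8\xff")
```

The resulting `part` goes into the "content" list next to the text part. Whether this avoids the presigned URL error is an assumption on my side; nobody in this thread has verified it.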

I got the same error. The API documentation is wrong; it corresponds to the gpt-4-vision-preview model:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])

Yes, it seems that the documentation isn’t always correct. Please refer to the method I mentioned above.

The format of the returned data no longer corresponds to JSON. Here is another excerpt from the OpenAI documentation:

from openai import OpenAI
client = OpenAI()

audio_file = open("speech.mp3", "rb")
transcript = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file
)

{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that."
}
To access the response, you previously wrote transcript['text'] (as with a dictionary), but now you have to write transcript.text. That looks more like an attribute of a class instance; in short, you get back an object and not JSON.
My impression is that during OpenAI's last big update last year, with the switch from the openai.Audio.translate style to client.audio.transcriptions.create, the documentation update did not keep up and ended up as a mix of the two.
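
To illustrate the transcript['text'] versus transcript.text difference without calling the API, here is a small sketch of an accessor that tolerates both shapes. `get_text` is a hypothetical helper, and `SimpleNamespace` is only a stand-in for the object the new client returns; the real response class is not reproduced here.

```python
from types import SimpleNamespace
from typing import Any


def get_text(result: Any) -> str:
    """Read `text` from a dict-style or an attribute-style result (hypothetical helper)."""
    if isinstance(result, dict):
        # Old style: the response behaved like a dictionary.
        return result["text"]
    # New style: the response is an object with a `text` attribute.
    return result.text


old_style = {"text": "hello"}                # stand-in for the pre-1.0 result
new_style = SimpleNamespace(text="hello")    # stand-in for the v1 client result
```

Both `get_text(old_style)` and `get_text(new_style)` give the same string. I believe the v1 response objects can also be converted to a plain dict (for example with `model_dump()`), but check that against the SDK version you have installed.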

This solution saved my day with GPT-4o, along with converting max_tokens to an integer. Thank you.
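
For the max_tokens part, this is roughly what the conversion looks like when the value arrives as a string (from an environment variable or a config file, say). A minimal sketch; `coerce_max_tokens` is a hypothetical helper and the "300" is just the value used earlier in this thread.

```python
def coerce_max_tokens(value) -> int:
    """Convert a max_tokens setting to a positive int (hypothetical helper)."""
    tokens = int(value)  # raises ValueError for input such as "abc"
    if tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return tokens


# Keyword arguments for the request; "300" stands in for a value read from config.
request_kwargs = {"model": "gpt-4o", "max_tokens": coerce_max_tokens("300")}
```

The dictionary can then be unpacked into the call together with the messages, e.g. `client.chat.completions.create(messages=messages, **request_kwargs)`.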
