GPT-4-turbo vision API recognizes image_url as base64 encoded image data

Below is the relevant section of my TypeScript code. The s3ImageURL and resolution variables have valid values.
s3ImageURL looks like https://s3.ap-northeast-2.amazonaws.com/bucketName/fileName
resolution is one of 'auto' | 'low' | 'high'

    const response = await openai.chat.completions.create({
      model: 'gpt-4-turbo',
      messages: [
        {
          role: 'user',
          content: [
            {
              type: 'text',
              text: 'What’s in this image?'
            },
            {
              type: 'image_url',
              image_url: {
                url: s3ImageURL,
                detail: resolution
              }
            }
          ]
        }
      ],
      max_tokens: maxTokens
    });

I am using gpt-4-turbo to describe an image stored in a public AWS S3 bucket. I pass the correct S3 URL, but the API returns this error:

Error: 400 Invalid image URL: 'messages[0].content[1].image_url.url'. Expected a base64-encoded data URL with an image MIME type (e.g. 'data:image/png;base64,aW1nIGJ5dGVzIGhlcmU='), but got a value without the 'data:' prefix.

I don’t know why it keeps recognizing the image URL as encoded data. Is there any character in the URL that I need to escape? I do not want to download the image locally and convert it to base64 data.

Additional context:
When I run the above code in a function from the command line, it works fine and successfully describes the image at that URL. However, when I call the same function from an Express router handler, the API expects a base64-encoded data URL for some reason and the call fails.

Did you figure this out? I am running into the same problem: passing an image_url keeps giving an error saying it’s expecting a base64 image.

In my case the max_tokens property value was a string and not an integer. That was the cause of the “Error: 400 Invalid image URL: 'messages[0].content[1].image_u…” error.

You mean you were passing "max_tokens": "300" instead of "max_tokens": 300? Because that’s what I am doing. (I ended up base64 encoding to bypass my issue.)

Sorry for the late response. I changed my function to use base64 encoding because I could not figure out the cause. Later, I tried strictly typing my parameters (resolution, maxTokens), and that resolved the issue. Make sure all your parameter values are the correct type if you are using TypeScript.
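
One way to apply the strict typing described above is to normalize the values before the API call. This is only a minimal sketch, assuming maxTokens and resolution reach the function as untyped values from an Express request (for example req.query, where everything is a string), which would be consistent with the same code working from the command line. The names normalizeVisionParams, VisionParams, and DETAILS are hypothetical and not part of the OpenAI SDK.

```typescript
// Hypothetical helper: the names here are illustrative, not part of the OpenAI SDK.
type Detail = 'auto' | 'low' | 'high';

interface VisionParams {
  maxTokens: number;
  detail: Detail;
}

const DETAILS: readonly Detail[] = ['auto', 'low', 'high'];

// Coerce untyped request values (e.g. strings from req.query) into the types
// the API expects, and fail early with a clear message otherwise.
function normalizeVisionParams(rawMaxTokens: unknown, rawDetail: unknown): VisionParams {
  const maxTokens = typeof rawMaxTokens === 'string' ? Number(rawMaxTokens) : rawMaxTokens;
  if (typeof maxTokens !== 'number' || !Number.isInteger(maxTokens) || maxTokens <= 0) {
    throw new TypeError(`max_tokens must be a positive integer, got ${JSON.stringify(rawMaxTokens)}`);
  }
  const detail = DETAILS.find((d) => d === rawDetail);
  if (detail === undefined) {
    throw new TypeError(`detail must be one of ${DETAILS.join(', ')}, got ${JSON.stringify(rawDetail)}`);
  }
  return { maxTokens, detail };
}
```

In a router handler, the result would then feed the request, e.g. max_tokens: params.maxTokens and detail: params.detail, so a string such as "300" never reaches the API.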

This solution saved my day with GPT-4o: converting max_tokens to an integer. Thank you.

For those who come across this question: this might have nothing to do with the image_url.
Check that no unexpected values are passed to the API. For example, check that you are not passing null where it is not needed.
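
A minimal sketch of that check, assuming the request options are assembled as a plain object before the call. The helper omitNullish is a hypothetical name, and it only inspects top-level keys, so nested objects such as messages would need their own check.

```typescript
// Hypothetical helper: drops top-level keys whose value is null or undefined,
// so optional request fields are omitted instead of being sent as null.
function omitNullish(obj: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    if (value !== null && value !== undefined) {
      result[key] = value;
    }
  }
  return result;
}

// Illustrative values only: user and stop stand in for optional fields left unset.
const cleaned = omitNullish({ model: 'gpt-4-turbo', max_tokens: 300, user: null, stop: undefined });
```

Here cleaned keeps only model and max_tokens. Values such as 0, false, or an empty string are kept on purpose, since only null and undefined are treated as missing.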

Did anyone get a resolution to this? I’m getting the same error using this:

    role: "user",
    content: [
      {
        type: "text",
        text: "Here is an image. Please provide a description",
      },
      {
        type: "image_url",
        image_url: { url: `${S3imgURL}` },
      },
    ],

And I’m getting this error:

OpenAI Error Response Data: {"error": {"code": "invalid_value", "message": "Invalid image URL: 'messages[1].content[1].image_url.url'. Expected a base64-encoded data URL with an image MIME type (e.g. 'data:image/png;base64,aW1nIGJ5dGVzIGhlcmU='), but got a value without the 'data:' prefix.", "param": "messages[1].content[1].image_url.url", "type": "invalid_request_error"}}

As I understand it, using Base64 encoding will work but cost a lot more in tokens. Is that right?

It literally tells you right in the error message how to properly construct the url field.

Base64-encoded images cost the same as that image would if it were hosted somewhere.
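
For anyone falling back to base64 as several replies did, the data URL shape quoted in the error message can be built roughly like this. It is a sketch assuming a Node.js runtime (Buffer from node:buffer) and image bytes that are already in memory. The helper toImageDataUrl is a hypothetical name, and the MIME type is supplied by the caller rather than detected.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: builds the "data:<mime>;base64,<payload>" form quoted in
// the error message from raw image bytes and a caller-supplied MIME type.
function toImageDataUrl(bytes: Uint8Array, mimeType: string): string {
  if (!mimeType.startsWith('image/')) {
    throw new TypeError(`expected an image MIME type, got "${mimeType}"`);
  }
  return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}
```

The returned string goes in the same image_url.url field that would otherwise hold the S3 URL.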

The API documentation shows that if you’re passing a URL, you don’t need the mime type or for it to be base64 encoded:

https://platform.openai.com/docs/guides/images?api-mode=responses

Yep, the documentation is wrong. I’m putting a pin in this reply: I’ve been meaning to develop a comprehensive “vision,files:wot?!” guide covering their documentation and its current issues needing correction across Responses, Chat Completions, and Assistants, which are all different and also involve SDKs.



Thanks for the update.

I expect the SDK now doesn’t send what the API Reference indicates:

but what actually works…

        {
          "type":"file",
          "file":{
            "file_data":"data:application/pdf;base64,JVBERi0xLjUNJeLjz9MN...",
            "filename":"my_knowledge.pdf"
          }
        }

(this is the case for file inputs that are a different type than vision inputs)

(here’s where my SDK dump goes…)