API 4o Assistant thread unable to take base64 image as input

gmudcxr · October 17, 2024, 3:35pm

I am having an issue where the assistant is unable to take a base64 image as input. It looks like a syntax error, because I am using image_url to pass the image, and image_url seems to be the only way to pass images held in a buffer. This syntax works with the client chat but not with an assistant thread. Could anyone help me out? What is the correct way to do this?

thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the difference between these images?"
        },
        {
          "type": "image_url",
          "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}
        },
        {
          "type": "image_url",
          "image_url": {"url": "https://www.google.com/images/branding/googlelogo/2x/googlelogo_color_272x92dp.png"}
        },
      ],
    }
  ]
)

The error is

openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'messages[0].content[1].image_url.url'. Expected a valid URL, but got a value with an invalid format.", 'type': 'invalid_request_error', 'param': 'messages[0].content[1].image_url.url', 'code': 'invalid_value'}}

_j · October 17, 2024, 6:02pm

The assistant thread is unable to take base64 as input.

True.

The method to provide an image file is to:

1. Upload it to the files endpoint, with purpose 'assistants', and receive the file ID.
2. Use the file ID with the image_file object within the user content, which can be seen by expanding the API reference.
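
The two steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (upload_base64_image, image_message, create_thread_with_images) and the placeholder filename "image.jpg" are made up for illustration, and `client` is assumed to be an `OpenAI()` instance from the official Python package. The purpose 'assistants' follows the reply above; OpenAI's documentation also shows a 'vision' purpose for image inputs, so that may be worth trying too.

```python
import base64


def upload_base64_image(client, b64_data, filename="image.jpg", purpose="assistants"):
    """Decode a base64 string and upload the bytes to the files endpoint.

    `filename` is a placeholder chosen for this sketch; the extension tells
    the API which image format it is receiving.
    """
    image_bytes = base64.b64decode(b64_data)
    # The Python SDK accepts a (filename, bytes) tuple for in-memory files.
    uploaded = client.files.create(file=(filename, image_bytes), purpose=purpose)
    return uploaded.id


def image_message(text, file_ids):
    """Build a user message: one text part, then one image_file part per ID."""
    content = [{"type": "text", "text": text}]
    for file_id in file_ids:
        content.append({"type": "image_file", "image_file": {"file_id": file_id}})
    return {"role": "user", "content": content}


def create_thread_with_images(client, text, b64_images):
    """Upload every base64 image, then create a thread that references them."""
    file_ids = [upload_base64_image(client, b64) for b64 in b64_images]
    return client.beta.threads.create(messages=[image_message(text, file_ids)])
```

Usage would look something like `create_thread_with_images(OpenAI(), "What is the difference between these images?", [base64_image])`. Images that already live at a public URL can stay as image_url parts; only the base64 data needs the upload step.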

I hope this gives a path to success and happiness.

gmudcxr · October 18, 2024, 2:42pm

Thank you for putting me on the path to success and happiness.

Do you know if there is a more optimal way for inputting images? I am looking to have the assistant view about 150 images at once. I know it’s a lot but this is something I was able to do with client chat. I am trying assistant because I want it to have some uploaded knowledge.

_j · October 18, 2024, 4:04pm

There’s a more optimal way of inputting knowledge than images: text.

Instead of running images as input over and over again, just run a one-time task on each image to extract the text within it for later use. (I assume these are perhaps PDF pages or slides.)

That will give clarity and less distraction from the task at hand. Then you can also use the file search on that data when uploaded as a text file.
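
One way to do that extraction pass, as a rough sketch: send each image once through chat completions (where the base64 data URL does work, per the first post) and keep only the returned text. The function names, the prompt wording, the "--- image N ---" separator, and the default model name "gpt-4o" are all assumptions for illustration; `client` is assumed to be an `OpenAI()` instance.

```python
def transcribe_image(client, b64_jpeg, model="gpt-4o"):
    """Ask the chat model to transcribe the text in one base64 JPEG."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Transcribe all text visible in this image."},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64_jpeg}"},
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content


def images_to_text(client, b64_images):
    """Transcribe each image once and join the results into one document."""
    pages = [transcribe_image(client, b64) for b64 in b64_images]
    return "\n\n".join(
        f"--- image {i} ---\n{text}" for i, text in enumerate(pages, start=1)
    )
```

The combined string can be written to a .txt file and uploaded once, so the assistant searches text instead of re-reading 150 images on every run.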

Upload Optimization:

Unfortunately, while the files endpoint uses multipart/form-data, which can support multiple files to streamline network communications, the API is glad to ignore multiples being sent, and the return object doesn’t have a way to present multiple IDs.

You can upload in parallel, though.
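
A minimal sketch of parallel uploads with a thread pool, one request per file. The helper names and the default of 8 workers are arbitrary choices for illustration, `client` is assumed to be an `OpenAI()` instance, and whether your account's rate limits tolerate that much concurrency is something to check yourself.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_one(client, filename, data, purpose="assistants"):
    """Upload a single in-memory image and return its file ID."""
    return client.files.create(file=(filename, data), purpose=purpose).id


def upload_all(client, images, max_workers=8):
    """Upload (filename, bytes) pairs concurrently.

    Returns the file IDs in the same order as `images`, because
    Executor.map yields results in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda item: upload_one(client, *item), images))
```

Each upload is still its own HTTP request, so this only overlaps the waiting time; it does not reduce the number of calls.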
