Can Assistants API understand image files uploaded?

I tried to send a screenshot to the agent and the agent responded with “It appears that there has been an issue with accessing the uploaded screenshot”.

Is this by design or am I missing something?

1 Like

The Vision docs say, at

Note that the Assistants API does not currently support image inputs.

Elsewhere there was a note that Retrieval could be used for it, but at least in the listing none of the image formats are supported for Retrieval. I tried it anyway yesterday and nothing worked.

So I’m now back to just using the vision model without assistants, keeping the thread in my backend and re-sending the messages for continued discussion, which I guess is what the Assistants API does internally anyway.

I’m having the same issue trying to build an assistant capable of handling Images as inputs. Very disappointing that the assistants aren’t currently capable of handling images as inputs.

You can build a function that makes image recognition (from any service) useful.

Let’s say you have MiniGPT running on your server to label areas of an image with contents.

Then you just need to provide your normal chat completions AI chatbot you programmed with a function specification to call that. Obviously you can’t send an actual image, but a user could supply a URL via chat or by your webpage dialog that directly interfaces with the function.

1 Like