Can Assistants API understand image files uploaded?

I tried to send a screenshot to the agent and the agent responded with “It appears that there has been an issue with accessing the uploaded screenshot”.

Is this by design or am I missing something?

The Vision docs at https://platform.openai.com/docs/guides/vision say:

Note that the Assistants API does not currently support image inputs.

Elsewhere there was a note that Retrieval could be used for it, but none of the image formats appear in the list of file types supported for Retrieval. I tried it anyway yesterday and nothing worked.

So I’m now back to just using the vision model without assistants, keeping the thread in my backend and re-sending the messages for continued discussion, which I guess is what the Assistants API does internally anyway.

I’m having the same issue trying to build an assistant that accepts images as input. Very disappointing that the Assistants API can’t currently handle them.

You can build a function that makes image recognition (from any service) useful.

Let’s say you have MiniGPT running on your server to label areas of an image with contents.

Then you just need to give your normal Chat Completions chatbot a function specification to call it. Obviously you can’t send an actual image, but a user could supply a URL via chat, or through a dialog on your webpage that interfaces directly with the function.
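As a sketch of what that function specification might look like (the function name `label_image` and its parameters are my own invention, not from any real service):

```python
# Hypothetical function specification for the Chat Completions API.
# The model can "call" this function by returning its name and arguments;
# your backend then runs the actual image-recognition service.
label_image_spec = {
    "name": "label_image",
    "description": "Label the contents of an image supplied by the user as a URL.",
    "parameters": {
        "type": "object",
        "properties": {
            "image_url": {
                "type": "string",
                "description": "Publicly reachable URL of the image to analyze",
            }
        },
        "required": ["image_url"],
    },
}
```

You would pass this in the `functions` (or newer `tools`) parameter of the chat completions request, then run the real recognition service whenever the model asks for it and return the labels as the function result.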

Yeah, I ended up using Vision API through Chat completions API for my use case.

So the user uploads the image on the frontend, the frontend sends it to the backend, and I store it on my server. I then pass the URL to the Assistant, which in turn passes the user message and the URL to the Chat API through function calling. The Chat API describes the image using Vision and returns the response to the Assistant, which relays it back to the user.
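A minimal sketch of that hand-off in Python (the model name reflects the era of this thread, and the helper only builds the request payload; the actual call, commented out below, requires the `openai` package and an API key):

```python
# The Assistant's function call supplies an image URL; the backend forwards
# it to the Chat Completions API with the vision model.
def build_vision_request(image_url: str, question: str = "Describe this image."):
    """Build a Chat Completions payload asking the vision model about an image."""
    return {
        "model": "gpt-4-vision-preview",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 300,
    }

# The actual call would look something like:
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(**build_vision_request(url))
# description = response.choices[0].message.content
```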

If anyone is interested I wrote a short blog post on how that can be done in Laravel but the general logic is applicable in any language:

Using Laravel to interact with Assistants API and Vision

Please share the code. I have been trying to do the same using chat completions, but it doesn’t support a thread ID, I guess.

In chat completions, you can include an image directly in the user input, but only with the gpt-4-vision-preview model (which you can switch to just for the turns where there is an image to analyze).

You manage conversation history yourself, by giving the AI some of the past chat exchanges before the latest user input.
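A minimal sketch of both points, assuming the model names from this thread’s era; the function prepends your self-managed history and switches to the vision model only when the latest user turn includes an image:

```python
# Build a chat completions request from self-managed history plus the new
# user turn. Model names are assumptions from the time of this discussion.
def build_request(history, user_text, image_url=None):
    content = user_text
    model = "gpt-3.5-turbo"
    if image_url is not None:
        # Images require the vision model and the list-style content format.
        model = "gpt-4-vision-preview"
        content = [
            {"type": "text", "text": user_text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]
    messages = history + [{"role": "user", "content": content}]
    return {"model": model, "messages": messages}
```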

You can read more about vision use at https://platform.openai.com/docs/guides/vision (already posted above), or by expanding the user message section of the Chat Completions API reference.

Why do I have to provide past chat content? Can’t we just continue the discussion under the same ID?

Suppose I sent an image input for description and got an ID in the response. Can I use that ID to continue the discussion, so I don’t need to resend past chat content for it to understand the previous history, just like ChatGPT does?

The chat completions endpoint does not maintain threads or conversations; the id it returns is only used internally by OpenAI. The endpoint is stateless and memoryless: you get direct access to the AI model, which generates each answer solely from the input text you send.
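In other words, “continuing” a conversation means re-sending the transcript yourself on every turn. A minimal sketch (the reply is stubbed here; in practice it would come from the API response):

```python
# Self-managed conversation loop for a stateless endpoint: each turn appends
# the user message, obtains a reply, and records it for the next request.
history = []

def chat_turn(history, user_text, get_reply):
    """Append the user turn, fetch a reply for the full transcript, record it."""
    history.append({"role": "user", "content": user_text})
    reply = get_reply(history)  # e.g. a call to client.chat.completions.create
    history.append({"role": "assistant", "content": reply})
    return reply

# Stubbed example; a real get_reply would call the API with `history`:
reply = chat_turn(history, "What did I just show you?", lambda msgs: "ok")
```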
