What code would enable me to upload an image to a custom assistant via API? The vision Python script here is hard to modify to do so… Vision - OpenAI API
The linked “vision” guide is not applicable for Assistants, where you:
- upload to file storage with purpose “vision”, receive ID
- create a user message with the file ID as part of the message content
- then manage and maintain the link from file to chat so your platform can clean up after itself after chat deletion or expiration.
How can you create a user message with the file ID as part of the message content? Surely the message content will just be the text. I have tried passing the file ID of the image as the content, but then I get no response.
# create a new message
input_message = self.client.beta.threads.messages.create(
    thread_id=self.assistant_thread.id,
    role="user",
    content=file_id  # or text? or is there another parameter?
)
Sending a message is easy when it is just text.
The content field is more complex when it is an array, which is what's required to attach the ID of an image file that has been uploaded to API file storage.
API reference transcribed to something that doesn’t need a half-dozen clicks:
- content - string or array Required
- Text content - string - The text contents of the message.
- Array of content parts - array - An array of content parts with a defined type: each part is either text, or an image passed as image_url or image_file. Image types are only supported on vision-compatible models.
- Image file - object
- type - string Required - Always image_file.
- image_file - object Required
- file_id - string Required - The File ID of the image in the message content. Set purpose="vision" when uploading the File if you need to later display the file content.
- detail - string Optional - Defaults to auto. Specifies the detail level of the image: low uses fewer tokens; you can opt in to high resolution with high.
- Image URL - object
- type - string Required - The type of the content part.
- image_url - object Required - Contains the url of the external image, plus an optional detail with the same low/high/auto values as above.
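For the image_url variant, a content part is shaped like this (a sketch based on the reference above; the URL is a placeholder and must point to a publicly reachable image):

```python
# a content part referencing an external image by URL
image_url_part = {
    "type": "image_url",
    "image_url": {
        "url": "https://example.com/photo.png",  # placeholder URL
        "detail": "auto",  # optional: "low", "high", or "auto"
    },
}
```

This part drops into the same content array as text and image_file parts.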
Here's how you can structure the content array to intersperse text with images (harder to do from a chatbot input box). Each part of the message is an object in the array that contains either text or an image file reference:
from openai import OpenAI
client = OpenAI()

# purpose="vision" is set at upload time, not in the message content
thread_message = client.beta.threads.messages.create(
    "thread_abc123",
    role="user",
    content=[
        {
            "type": "text",
            "text": "What's in this image?"
        },
        {
            "type": "image_file",
            "image_file": {
                "file_id": "file-1234image"
            }
        },
        {
            "type": "text",
            "text": "What's in the second image?"
        },
        {
            "type": "image_file",
            "image_file": {
                "file_id": "file-2345image"
            }
        }
    ]
)
This code snippet includes four parts in the content array:
- Text asking about the contents of the first image.
- An image file reference to "file-1234image" (uploaded earlier with purpose "vision").
- Text asking about the contents of a second image.
- Another image file reference, to "file-2345image", also uploaded with that purpose.
Ensure that both image files have been uploaded correctly and that you track the file IDs returned, as part of your server or system's management of files.
In this code, I've set the file_id to "file-1234image", assuming that's replaced with the ID you receive back from uploading. Make sure the file was uploaded with purpose set to "vision" to ensure proper handling for image analysis and similar tasks.
Also, those uploaded files don’t delete themselves!