What code would enable me to upload an image to a custom assistant via API? The vision Python script here is hard to modify to do so… Vision - OpenAI API
The linked “vision” guide is not applicable for Assistants, where you:
- upload to file storage with purpose “vision”, receive ID
- create a user message with the file ID as part of the message content
- then manage and maintain the link from file to chat so your platform can clean up after itself after chat deletion or expiration.
How can you create a user message with the file ID as part of the message content? Surely the message content will just be the text. I have tried passing the file ID of the image as the content, but then I get no response.
# create a new message
input_message = self.client.beta.threads.messages.create(
    thread_id=self.assistant_thread.id,
    role="user",
    content=file_id  # or text? or is there another parameter?
)
Sending a message is easy when it is just text.
The content field is more complex when it is an array, which is what's required to attach the ID of an image file that has been uploaded to API file storage.
API reference transcribed to something that doesn’t need a half-dozen clicks:
- content - string or array Required
- Text content - string - The text contents of the message.
- Array of content parts - array - An array of content parts with a defined type: each part is either text, or an image passed as image_url or image_file. Image types are only supported on vision-compatible models.
- Image file - object
- type - string Required - Always image_file.
- image_file - object Required
- file_id - string Required - The File ID of the image in the message content. Set purpose="vision" when uploading the File if you need to later display the file content.
- detail - string Optional - Defaults to auto. Specifies the detail level of the image: low uses fewer tokens; you can opt in to high resolution with high.
- Image URL - object
- type - string Required - The type of the content part.
- image_url - object Required - Contains the url of the external image, plus an optional detail with the same low/high/auto values as above.
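For the image_url variant, a content part is shaped like this (a sketch based on the reference above; the URL is a placeholder and must point to a publicly reachable image):

```python
# a content part referencing an external image by URL
image_url_part = {
    "type": "image_url",
    "image_url": {
        "url": "https://example.com/photo.png",  # placeholder URL
        "detail": "auto",  # optional: "low", "high", or "auto"
    },
}
```

This part drops into the same content array as text and image_file parts.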
Here's how you can structure the content array to intersperse text with images (harder to do from a chatbot input box). Each part of the message is an object in the array that contains either text or an image file reference:
from openai import OpenAI
client = OpenAI()

# purpose="vision" is set at upload time, not in the message content
thread_message = client.beta.threads.messages.create(
    "thread_abc123",
    role="user",
    content=[
        {
            "type": "text",
            "text": "What's in this image?"
        },
        {
            "type": "image_file",
            "image_file": {
                "file_id": "file-1234image"
            }
        },
        {
            "type": "text",
            "text": "What's in the second image?"
        },
        {
            "type": "image_file",
            "image_file": {
                "file_id": "file-2345image"
            }
        }
    ]
)
This code snippet includes four parts in the content array:
- Text asking about the contents of the first image.
- An image file reference to "file-1234image" (uploaded earlier with purpose "vision").
- Text asking about the contents of a second image.
- Another image file reference, to "file-2345image", also uploaded with that purpose.
Ensure that both image files have been uploaded correctly and that you track the file IDs returned, as part of your server or system's management of files.
In this code, I've set the file_id to "file-1234image", assuming that's replaced with the ID you receive back from uploading. Make sure the file was uploaded with purpose set to "vision" to ensure proper handling for image analysis and similar tasks.
Also, those uploaded files don’t delete themselves!