What code would enable me to upload an image to a custom assistant via API? The vision Python script here is hard to modify to do so… Vision - OpenAI API
The linked “vision” guide is not applicable for Assistants, where you:
- upload to file storage with purpose “vision”, receive ID
- create a user message with the file ID as part of the message content
- then manage and maintain the link from file to chat so your platform can clean up after itself after chat deletion or expiration.
How can you create a user message with the file ID as part of the message content? Surely the message content will just be the text. I have tried passing the file ID of the image as the content, but then I get no response.
# create a new message
input_message = self.client.beta.threads.messages.create(
    thread_id=self.assistant_thread.id,
    role="user",
    content=...  # file_id, or text, or is there another parameter?
)
Sending a message is easy when it is just text.
The content field is more complex when it is “array”, which is required for images.
Here is a new API reference for you, since the current one is so useless that neither human nor AI can understand it.
Assistants Message Content
You have two different ways of sending content:
content - string - Just the text you send as user message, no additional parts.
or
content - array - An array (list) of content parts with a defined type. Each part can be of type text, or images can be passed with type image_url or image_file. (Image types are only supported on Vision-compatible models.)
Images for gpt-4 vision are only supported in user role messages.
Possible array objects for the content array
You can combine any of these object blocks to construct a message, interleaving text with images.
- Object - for passing plain text
  - type - string - Required - Always text.
  - text - string - Required - A language input for AI
and/or
- Object - for passing files for vision
  - type - string - Required - Always image_file.
  - image_file - object - Required
    - file_id - string - Required - The file ID of the image, returned from uploading to files. Set purpose=“vision” when uploading the file if you need to later display the file content from storage.
    - detail - string - Optional - Defaults to auto - Specifies the detail level of the image. low uses fewer tokens, you can opt in to high resolution using high (default is actually ALWAYS high).
and/or
- Object - for passing internet URLs for vision
  - type - string - Required - Always image_url.
  - image_url - object - Required
    - url - string - Required - The internet location of a file.
    - detail - string - Optional - Defaults to auto
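If it helps to catch mistakes before sending, the reference above can be turned into a small local check. This is only a sketch: the function name validate_content and the error strings are my own invention, not part of the API, and it checks nothing beyond the fields listed above.

```python
from typing import Any, Dict, List

# Detail levels named in the reference above.
ALLOWED_DETAIL = {"auto", "low", "high"}


def validate_content(content: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found in a content array (empty if none).

    Hypothetical helper: it only mirrors the fields described in this post.
    """
    problems: List[str] = []
    for i, part in enumerate(content):
        kind = part.get("type")
        if kind == "text":
            if not isinstance(part.get("text"), str):
                problems.append(f"content[{i}]: 'text' must be a string")
        elif kind in ("image_file", "image_url"):
            inner = part.get(kind)
            key = "file_id" if kind == "image_file" else "url"
            if not isinstance(inner, dict) or not isinstance(inner.get(key), str):
                problems.append(f"content[{i}]: '{kind}.{key}' is required")
            elif inner.get("detail", "auto") not in ALLOWED_DETAIL:
                problems.append(f"content[{i}]: unknown detail {inner['detail']!r}")
        else:
            problems.append(f"content[{i}]: unknown type {kind!r}")
    return problems
```

An empty list back means the array matches the shapes described here; the API itself remains the final judge.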
Examples
Here’s how you can structure the content array to intersperse text with images. Each part of the message content is an object in the array that either contains text or an image file reference:
# Create message content for the thread, using an image file
content = [
    {
        "type": "text",
        "text": "What's in this image?"
    },
    {
        "type": "image_file",
        "image_file": {
            "file_id": file_id,
            "detail": "auto"
        }
    }
]
# Create an image message in the thread, using image URL
content = [
    {
        "type": "text",
        "text": "Analyze these images."
    },
    {
        "type": "text",
        "text": "image file name: xGOspjW.jpeg"
    },
    {
        "type": "image_url",
        "image_url": {
            "url": "https://i.imgur.com/xGOspjW.jpeg",
            "detail": "low"
        }
    }
]
Messages can either be added to an existing thread, or supplied as initial messages when you create a thread.
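For the second route, a rough sketch of creating a thread that already carries an image message. The helper names image_file_message and create_thread_with_image are made up for illustration, and client is assumed to be an openai.OpenAI() instance passed in by the caller.

```python
from typing import Any, Dict, List


def image_file_message(text: str, file_id: str, detail: str = "auto") -> Dict[str, Any]:
    """Build one user message whose content pairs text with an uploaded image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_file", "image_file": {"file_id": file_id, "detail": detail}},
        ],
    }


def create_thread_with_image(client: Any, text: str, file_id: str) -> Any:
    """Create a thread with the image message as its initial message.

    Assumption: `client` is an openai.OpenAI() instance (hypothetical wiring).
    """
    messages: List[Dict[str, Any]] = [image_file_message(text, file_id)]
    return client.beta.threads.create(messages=messages)
```

The same dictionary from image_file_message can also be unpacked into a later messages.create call on an existing thread, so both routes share one message shape.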
The Python example missing from documentation
- Having already obtained a thread id and uploaded a file
from openai import OpenAI
client = OpenAI()
# Add one user message that interleaves text with two images
thread_message = client.beta.threads.messages.create(
    "thread_AO69S01Yd9uOY4ad4LXvEX6w01Yd",
    role="user",
    content=[
        {
            "type": "text",
            "text": "What's in this image?"
        },
        {
            "type": "image_file",
            "image_file": {
                "file_id": "file-UuzPKi0gb33DKi01VlAj76wsoyi1V",
                "detail": "auto",
            }
        },
        {
            "type": "text",
            "text": "What's in the second image?"
        },
        {
            "type": "image_url",
            "image_url": {
                "url": "http://i.imgur.com/my_file.jpg",
                "detail": "low"
            }
        }
    ]
)
This code snippet includes four parts in the content array:
- Text asking about the contents of the first image.
- An image file reference to “file-UuzPKi0gb33DKi01VlAj76wsoyi1V”
- Text asking about the contents of a second image.
- An image reference to an internet image URL
Tip: unless the assistant's instructions tell the AI that it has built-in vision capability, it may refuse the request or act as if it cannot see images.
Also, those uploaded vision files don’t expire nor delete themselves!
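Since nothing cleans those files up for you, one way to keep the file-to-thread link is a small in-memory registry. This is a sketch only: the class VisionFileRegistry is hypothetical, and a real platform would presumably persist the mapping in its own database.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


class VisionFileRegistry:
    """Remembers which uploaded file IDs belong to which thread (hypothetical helper)."""

    def __init__(self) -> None:
        self._files: Dict[str, Set[str]] = defaultdict(set)

    def link(self, thread_id: str, file_id: str) -> None:
        """Record that a file was used in a thread."""
        self._files[thread_id].add(file_id)

    def cleanup(self, thread_id: str, delete_file: Callable[[str], object]) -> List[str]:
        """Delete every file linked to the thread and return the IDs deleted."""
        deleted: List[str] = []
        for file_id in sorted(self._files.pop(thread_id, set())):
            delete_file(file_id)  # e.g. client.files.delete from the openai SDK
            deleted.append(file_id)
        return deleted
```

With the openai SDK, registry.cleanup(thread_id, client.files.delete) would be called when the chat is deleted or expires; passing the delete function in keeps the registry usable with any client.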
Hi @_j ,
I submitted the request body below, and the files have been uploaded with the purpose of “vision.”
{
  "role": "user",
  "attachments": [
    {
      "file_id": "file-U4wnOiO9X5B2eAEUxuCfgYy7",
      "purpose": "vision"
    },
    {
      "file_id": "file-s6SbUoeSv4KWji6x0yNU6k5H",
      "purpose": "vision"
    }
  ],
  "content": [
    {
      "type": "text",
      "text": "Extract the details from images."
    }
  ]
}
I noticed in the guide that the “tools” parameter is optional. However, I also tried using both the “tools” and “file_search” parameters. Since my files are in JPEG format, file search did not accept the request either.
{
  "error": {
    "message": "Missing required parameter: 'attachments[0].tools'.",
    "type": "invalid_request_error",
    "param": "attachments[0].tools",
    "code": "missing_required_parameter"
  }
}
If you have any ideas, I would appreciate them.
Hi @onurgulay !
I noticed that my guide had errors.
It should now be correct, extensive, and reference-quality, with everything verified by actually sending requests to the API, which is more than can be said for what the API Reference documents about “create message”.
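For the request body you posted, my reading of the error is that attachments are meant for tools, so going by the content reference earlier in this thread, the image file IDs likely belong in the content array as image_file parts instead. A minimal sketch of that rewrite; the function name is hypothetical, and it assumes every attachment in the body is an image.

```python
import copy
from typing import Any, Dict


def attachments_to_image_parts(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a message body with image file IDs moved into content.

    Assumption: every entry under "attachments" is an image meant for vision.
    """
    fixed = copy.deepcopy(body)
    content = fixed.get("content", [])
    if isinstance(content, str):
        # Promote plain-string content to the array form first.
        content = [{"type": "text", "text": content}]
    for attachment in fixed.pop("attachments", []):
        content.append(
            {"type": "image_file", "image_file": {"file_id": attachment["file_id"]}}
        )
    fixed["content"] = content
    return fixed
```

Attachments genuinely intended for a tool such as file search would need to stay where they are, so treat this as a starting point and not a general converter.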
Here is significant code from my own asyncio Python library, showing:
- upload
- create thread - with text message content
- add to thread - file image and text
- add to thread - url image and text
- list messages
- delete
The only requirements are a file named my_image.png and the httpx and aiofiles modules (openai also uses httpx).
MEGA_DEMO
''' Assistants - vision file and URL demonstration '''
import os
import httpx
import asyncio
import aiofiles
import json
from typing import List, Optional, Dict
API_KEY = os.environ.get("OPENAI_API_KEY")
if not API_KEY:
    raise ValueError("OPENAI_API_KEY environment variable not set")
HEADERS = {
    "OpenAI-Beta": "assistants=v2",
    "Authorization": f"Bearer {API_KEY}"
}
BASE_URL = "https://api.openai.com/v1"
async def upload_file(file_path: str, purpose="vision") -> dict:
    """
    Uploads a file to the OpenAI API for use with assistants vision.
    :param file_path: Path to the file to upload.
    :param purpose: Upload purpose sent with the file (defaults to "vision").
    :return: JSON response containing file details.
    """
    url = f"{BASE_URL}/files"
    async with httpx.AsyncClient() as client:
        async with aiofiles.open(file_path, 'rb') as f:
            file_content = await f.read()
        # multipart form: the file itself plus a plain "purpose" field
        files = {
            'file': (os.path.basename(file_path), file_content),
            'purpose': (None, purpose),
        }
        response = await client.post(url, headers=HEADERS, files=files)
        response.raise_for_status()
        return response.json()
async def create_thread(
    messages: Optional[List[Dict[str, str]]] = None
) -> dict:
    """
    Creates a new thread.
    :param messages: Optional list of messages to initialize the thread with.
    :return: JSON response containing thread details.
    """
    url = f"{BASE_URL}/threads"
    body = {}
    if messages:
        body["messages"] = messages
    async with httpx.AsyncClient() as client:
        response = await client.post(url, headers=HEADERS, json=body)
        response.raise_for_status()
        print(response.json())
        return response.json()
async def delete_thread(thread_id: str) -> dict:
    """
    Deletes a specific thread.
    :param thread_id: The ID of the thread to delete.
    :return: JSON response confirming deletion.
    """
    url = f"{BASE_URL}/threads/{thread_id}"
    async with httpx.AsyncClient() as client:
        response = await client.delete(url, headers=HEADERS)
        response.raise_for_status()
        return response.json()
async def create_message(
    thread_id: str,
    role: str,
    content: "str | List[dict]"
) -> dict:
    """
    Creates a new message within a thread.
    :param thread_id: The ID of the thread.
    :param role: The role of the message sender ('user' or 'assistant').
    :param content: The content of the message: a plain string, or a list of content parts.
    :return: JSON response containing the created message.
    """
    url = f"{BASE_URL}/threads/{thread_id}/messages"
    body = {
        "role": role,
        "content": content
    }
    async with httpx.AsyncClient() as client:
        response = await client.post(url, headers=HEADERS, json=body)
        response.raise_for_status()
        return response.json()
async def list_messages(
    thread_id: str,
    limit: int = 20,
    order: str = "desc",
    after: Optional[str] = None,
    before: Optional[str] = None,
    run_id: Optional[str] = None
) -> dict:
    """
    Lists messages within a specific thread.
    :param thread_id: The ID of the thread.
    :param limit: Number of messages to retrieve (1-100).
    :param order: Sort order, 'asc' or 'desc'.
    :param after: Cursor for pagination.
    :param before: Cursor for pagination.
    :param run_id: Filter messages by run ID.
    :return: JSON response containing a list of messages.
    """
    url = f"{BASE_URL}/threads/{thread_id}/messages"
    params = {
        "limit": limit,
        "order": order
    }
    if after:
        params["after"] = after
    if before:
        params["before"] = before
    if run_id:
        params["run_id"] = run_id
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers=HEADERS, params=params)
        response.raise_for_status()
        return response.json()
if __name__ == "__main__":
    async def main():
        """demonstrate a new thread with image messages"""
        # Upload a file
        print("uploading...")
        file_path = "my_image.png"  # Replace with your file path
        file_response = await upload_file(file_path)
        print("File uploaded:")
        print(json.dumps(file_response, indent=2))
        file_id = file_response["id"]
        # file_id = "file-blahblah"  # existing file instead
        # Create a new thread with initial messages (the role goes inside each message)
        print("creating thread")
        thread = await create_thread(
            messages=[
                {
                    "role": "user",
                    "content": "You will use your GPT-4 vision.",
                },
            ]
        )
        print("Thread created:")
        print(json.dumps(thread, indent=2))
        thread_id = thread["id"]
        # Create a message in the thread, using an image file
        content = [
            {
                "type": "text",
                "text": "What's in this image?"
            },
            {
                "type": "image_file",
                "image_file": {
                    "file_id": file_id,
                    "detail": "auto"
                }
            }
        ]
        message = await create_message(  # uses role parameter
            thread_id,
            role="user",
            content=content
        )
        print("Message created:")
        print(json.dumps(message, indent=2))
        # Create another image message in the thread, using image URL
        content = [
            {
                "type": "text",
                "text": "Answer about that and this image too."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://i.imgur.com/xGOspjW.jpeg",
                    "detail": "low"
                }
            }
        ]
        message = await create_message(
            thread_id,
            role="user",
            content=content
        )
        print("Message created:")
        print(json.dumps(message, indent=2))
        # List the thread messages
        print("retrieving message list...")
        messages = await list_messages(thread_id)
        print("Thread messages list:")
        print(json.dumps(messages, indent=2))
        # Clean up after demo
        delete_status = await delete_thread(thread_id)
        print(delete_status)
    asyncio.run(main())
@_j Thanks for your response. Have a fantastic working day!