Assistants API for querying images

With the Assistants API, documents can be uploaded for RAG. How can I use images, as with vision, in the Assistants API?

In ChatGPT Plus, I created a custom chat where documents and images can be uploaded and queried later. How can that be achieved using the API?

You can do that by:

  1. uploading the image to the API file storage, with purpose “vision”
  2. including the file ID as a part of a user message you add to a thread.

The API reference for assistants → messages has the format for passing user messages when you expand the content section.
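As a rough sketch of the two steps above, something like the following should work with the official `openai` Node.js package. The helper names `buildVisionContent` and `askAboutImage` are made up for illustration, and `client` is assumed to be an already-initialized SDK client; check the current API reference before relying on the exact shapes.

```javascript
const fs = require("node:fs");

// Build the content array for a user message that mixes text with an
// uploaded image. (Hypothetical helper name, not part of the SDK.)
function buildVisionContent(prompt, imageFileId) {
  return [
    { type: "text", text: prompt },
    { type: "image_file", image_file: { file_id: imageFileId } },
  ];
}

// Step 1: upload the image to file storage with purpose "vision".
// Step 2: reference the returned file ID in a user message on the thread.
// `client` is assumed to be an instance created with `new OpenAI()`.
async function askAboutImage(client, threadId, imagePath, prompt) {
  const file = await client.files.create({
    file: fs.createReadStream(imagePath),
    purpose: "vision",
  });
  return client.beta.threads.messages.create(threadId, {
    role: "user",
    content: buildVisionContent(prompt, file.id),
  });
}
```

You would call it as `await askAboutImage(client, myThreadId, "./photo.png", "What is in this picture?")`, where the path and prompt are placeholders, and then start a run on the thread as usual.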

Thanks Jay. It works well for me.
This is the code in Node.js - it may be useful for others too.

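// Assumes `openai` is an initialized client and that `prompt` and
// `myThreadId` are already defined.
const imageFileId = "file-xxxxxxxxxxxxxx"; // ID returned when the image was uploaded

// A user message can combine text and image parts in one content array.
const actualContent = [
  { type: "text", text: prompt },
  { type: "image_file", image_file: { file_id: imageFileId } },
];
const threadMessages = await openai.beta.threads.messages.create(myThreadId, {
  role: "user",
  content: actualContent,
});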

Uploading an image as part of a message worked back then; now it always fails, even in the Playground.

That is because OpenAI’s Assistants product, along with its developer outreach and support (if you aren’t a blog-worthy partner), is frankly a turd that has taken half a year to get a mere sugar-coating applied.

Vision has been broken by OpenAI for going on a week.

OpenAI’s blog no longer carries product announcements and developments, only “success stories”, with quotes like: “Driven by its mission to remove tedious tasks from developers’ workflows, JetBrains incorporated OpenAI’s API into its AI Assistant product.”

To be such a partner apparently means allowing OpenAI to make up stories, under your name, about your satisfaction and the quality of their product. Or perhaps it is simply that no sane developer would rely on OpenAI Assistants, a low-code generic “solution” that maximizes code use while minimizing your control, and only attempts to solve a dumb middle-manager problem: “how do I chat about my PDFs”.