With the Assistants API, documents can be uploaded for RAG. How can I use images, as in vision, with the Assistants API?
In ChatGPT Plus, I created a custom chat where documents and images can be uploaded and queried later. How can that be achieved using the API?
You can do that by:
- uploading the image to the API file storage, with purpose "vision"
- including the file ID as a part of a user message you add to a thread.
The API reference for Assistants > Messages shows the format for passing user messages when you expand the content section.
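The two steps above can be sketched roughly as follows in Node.js. This is a minimal sketch, not an official recipe: the helper names `uploadImageForVision` and `buildVisionContent` are made up for illustration, and `client` is assumed to be an instance of the official `openai` Node.js SDK created elsewhere with your API key.

```javascript
const fs = require("node:fs");

// Step 1: upload a local image to the API file storage with purpose "vision".
// `client` is assumed to be an OpenAI SDK instance (e.g. `new OpenAI()`).
// Returns the file ID (something like "file-...") to reference in messages.
async function uploadImageForVision(client, imagePath) {
  const file = await client.files.create({
    file: fs.createReadStream(imagePath),
    purpose: "vision",
  });
  return file.id;
}

// Step 2: build the content array for a user message that combines
// a text part with an image_file part pointing at the uploaded file.
function buildVisionContent(prompt, imageFileId) {
  return [
    { type: "text", text: prompt },
    { type: "image_file", image_file: { file_id: imageFileId } },
  ];
}

// Hypothetical usage (the thread ID and file path are placeholders):
//   const fileId = await uploadImageForVision(client, "./photo.png");
//   await client.beta.threads.messages.create(threadId, {
//     role: "user",
//     content: buildVisionContent("What is in this image?", fileId),
//   });
```

The content array is built separately from the upload so the same file ID can be reused across several messages without uploading the image again.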
Thanks Jay. It works well for me.
This is the code in Node.js, which may be helpful for others too.
// ID returned when the image was uploaded with purpose "vision"
const imageFileId = "file-xxxxxxxxxxxxxx";
// A user message can mix text parts and image_file parts
const actualContent = [
  { type: "text", text: prompt },
  { type: "image_file", image_file: { file_id: imageFileId } },
];
const threadMessages = await openai.beta.threads.messages.create(
  myThreadId,
  { role: "user", content: actualContent },
);
That is because OpenAI's assistant product, along with the developer outreach and support (if you aren't a blog-worthy partner), is, frankly, a turd that has taken half a year to have a mere sugar-coating applied.
Vision has been broken by OpenAI for going on a week.
OpenAI's blog is now not product announcements and developments but "success stories". Quotes like: "Driven by its mission to remove tedious tasks from developers' workflows, JetBrains incorporated OpenAI's API into its AI Assistant product."
To be such a partner is apparently allowing OpenAI to make up stories about your satisfaction and the quality of their product under your name. Or simply that no sane developer would rely on OpenAI Assistants, a low-code generic "solution" that maximizes code use while minimizing your control, and only attempts to solve a dumb middle-manager problem, "how do I chat about my PDFs".