Inputting an image in the Assistants API using the new vision model

Hello,

Two days ago, the new “gpt-4-turbo-2024-04-09” model was released, finally allowing vision capabilities to work alongside function calling.

I originally assumed that this change would allow the Assistants API to accept images, especially because the original disclaimer that the Assistants API doesn’t have vision capabilities seems to be gone (unless I’m misremembering?), and because, well, the Assistants API has access to this model.

But no matter how hard I try, I can’t seem to properly pass an image to the assistant, and there is still no documentation on this in the API reference.
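To illustrate, my attempts look roughly like this minimal sketch using the official Python SDK (the file path is a placeholder, and I’m assuming the current v1 beta endpoints); as far as I can tell, file_ids only feeds tools like retrieval and code interpreter, so the model never actually “sees” the image:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the image; "photo.png" is a placeholder path.
image_file = client.files.create(file=open("photo.png", "rb"), purpose="assistants")

thread = client.beta.threads.create()

# Attach the uploaded file to a user message. In the current beta, this
# attachment is intended for retrieval / code interpreter, not vision,
# so the assistant never interprets the image visually.
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is in this image?",
    file_ids=[image_file.id],
)
```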

Is this a planned feature that should roll out soon, or has it been implemented but not documented yet somehow?

At the top or in the sidebar of this forum:

  • Click “documentation”;
  • In the documentation sidebar, click “Vision”;
  • Read passages such as this:

GPT-4 Turbo with Vision allows the model to take in images and answer questions about them. … Previously, the model has sometimes been referred to as GPT-4V or gpt-4-vision-preview in the API.

  • Proceed to read how a user message with an image is constructed (a sketch follows below).
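For example, the documented shape of a vision request against the Chat Completions endpoint (note: not the Assistants API) looks roughly like this; the image URL is a placeholder:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo-2024-04-09",
    messages=[
        {
            "role": "user",
            # A vision message mixes text parts and image parts.
            "content": [
                {"type": "text", "text": "What is in this image?"},
                # Placeholder URL; a base64 data URL works here as well.
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```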

I tried it with another account via the API, and yes, it’s very new. I suspect this feature is the reason why my ChatGPT account’s access to GPT-4 was revoked -_-

Never mind then; it seems that the disclaimer is still there, I simply wasn’t looking in the right place:

Plese note that the Assistants API does not currently support image inputs.

Funny typo too!

Hopefully that gets added in soon.
