Two days ago, the new “gpt-4-turbo-2024-04-09” model was released, finally allowing vision capabilities to work alongside function calling.
I originally assumed that this change would allow the Assistants API to accept images as input - especially because the original disclaimer that the Assistants API doesn’t have vision capabilities seems to be gone. (Unless I’m misremembering?) And also because, well, the Assistants API has access to this model.
But no matter how hard I try, I can’t seem to properly input an image to the assistant, and there is still no documentation on this in the API reference.
Is this a planned feature that should roll out soon, or has it already been implemented but simply not documented yet?
GPT-4 Turbo with Vision allows the model to take in images and answer questions about them. … Previously, the model has sometimes been referred to as GPT-4V or gpt-4-vision-preview in the API.
From there, read up on how a user message containing an image is constructed.
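For reference, here is a minimal sketch of what such a user message looks like when sent through the Chat Completions endpoint (which, unlike the Assistants API, does document image input): the `content` field becomes a list mixing text parts and `image_url` parts. The model name and image URL below are placeholders for illustration, not values from the thread.

```python
# Sketch: constructing a Chat Completions request whose user message
# contains both text and an image, per the documented content-part format.

def build_image_message(text: str, image_url: str) -> dict:
    """Build a user message whose content mixes a text part and an image URL part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# Hypothetical request body; you would pass this to the Chat Completions
# endpoint with your own API key and a vision-capable model.
payload = {
    "model": "gpt-4-turbo-2024-04-09",
    "messages": [
        build_image_message(
            "What is in this image?",
            "https://example.com/photo.jpg",  # placeholder URL
        )
    ],
}
print(payload["messages"][0]["content"][1]["type"])  # → image_url
```

The key point is that the image travels inside the message's `content` list rather than as a separate attachment, which is why there is no obvious place to put one in an Assistants thread today.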