When I search, all the results that come back are about turning a description into an image, but I want to do the opposite. I want to start with an image and have GPT-3 describe what the image shows or, even better, have it build a description that also draws on the surrounding text (I am processing webpages).
Is this possible?
For that you need to use OpenAI's open-source CLIP model. You can test it on Replicate: rmokady/clip_prefix_caption.
Is it possible to use GPT-4 to describe images?
Welcome to the forum!
No, currently GPT-4 can only deal with text. Image ingestion is planned for a future release, but there is no timeline for that yet.
So what can we do to describe an image using the OpenAI API? Do you suggest any third-party tool?
You might take a look at technology like BASIC-L and CoCa; there are lots of image classification models out there. A ready-built option would be the Microsoft Image Processing API, and there are others.
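Since the GPT models discussed here only take text, one way to combine the suggestions in this thread is a two-step pipeline: get a caption from a separate image model, then hand that caption plus the surrounding page text to the text model as a prompt. Below is a rough sketch of the second step only. The names `build_description_prompt`, `fake_captioner`, and `max_context_chars` are made up for illustration, and the captioner is a stand-in for whatever captioning model or service you pick; no real API is called here.

```python
from typing import Callable


def build_description_prompt(
    image_path: str,
    surrounding_text: str,
    captioner: Callable[[str], str],
    max_context_chars: int = 500,
) -> str:
    """Build a text-only prompt from an image caption and page context.

    `captioner` is a hypothetical callable: it takes an image path and
    returns a caption string. Swap in a real captioning model or service.
    """
    caption = captioner(image_path).strip()
    # Collapse runs of whitespace from scraped HTML, then truncate the context.
    context = " ".join(surrounding_text.split())[:max_context_chars]
    return (
        "An image captioning model described an image as:\n"
        f"{caption}\n\n"
        "Text surrounding the image on the web page:\n"
        f"{context}\n\n"
        "Write a one-paragraph description of the image that uses this context."
    )


def fake_captioner(image_path: str) -> str:
    # Placeholder so the sketch runs without any model; returns a fixed caption.
    return "a dog running on a beach"


if __name__ == "__main__":
    print(build_description_prompt("dog.jpg", "Our   trip to\nCornwall", fake_captioner))
```

The resulting string is what you would send to the text model as the prompt. Keeping the captioner as a plain callable means you can try different captioning tools without touching the prompt-building code.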