I’m using the gpt-4-vision-preview API. I’m passing images to the API using data URLs. I’m also passing all the previous messages (message history).
It recognizes the image at first and describes it. If I follow up with another question about the image, it will say it cannot read images, or it will ask me to provide it again. But if I ask again, it recognizes the image.
So it seems like the model can look back in the chat and read the image again, but it is not always aware that it can. Is this normal? Is it just a matter of giving it a system message so it does not do this?
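For reference, here is a minimal sketch of how I'm building the request. This assumes the openai Python SDK v1.x message format with `image_url` content parts; the helper name and the sample history are just illustrations of my setup:

```python
import base64

def image_message(image_bytes: bytes, question: str) -> dict:
    """Hypothetical helper: build a user turn embedding the image as a data URL."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{b64}"},
            },
        ],
    }

# Every turn, I resend the full prior history verbatim, image included.
history = [image_message(b"\x89PNG...", "What is in this image?")]
history.append({"role": "assistant", "content": "It shows a cat on a sofa."})
history.append({"role": "user", "content": "What color is the cat?"})
```

The follow-up question at the end is where the model sometimes claims it cannot see the image, even though the data URL is still present earlier in `history`.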
On the web version, it has a tendency to straight up forget that it can analyze images right in the middle of a conversation, at which point you have to gently remind it that it can in fact do it and to “just try it”.
It helps if your image has some sort of unique code to identify it, and you use that code to refer to it, e.g. “analyze image a5”.
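To be clear, the code goes in the text part that accompanies the image, not in the image itself. A sketch of what I mean (the label scheme and helper name are just an illustration):

```python
import base64

def labeled_image_message(image_bytes: bytes, label: str) -> dict:
    """Hypothetical helper: pair the image with a stable text label like 'a5'."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": f"This is image {label}."},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{b64}"},
            },
        ],
    }

messages = [labeled_image_message(b"\x89PNG...", "a5")]
# Later turns can then refer back to it unambiguously:
messages.append({"role": "user", "content": "Analyze image a5."})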
Vision analysis costs you money on every “look”. If you are sending a URL instead of base64, you are also relying on someone else’s server to download and insert the image before the AI can answer.
Try putting the base64-encoded image (on your dime, of course) into the past user turns of the conversation history, exactly as it was originally sent. Count the tokens on a non-streamed response and you’ll see the images are being “visioned” again even when they are not in the most recent message. Then go ahead and ask about that chat history.
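One way to run that experiment (a sketch; `client` would be an `openai.OpenAI` instance, and the two calls cost real tokens): send the same history once with the base64 image left in the earlier turn, and once with the image part swapped for a text placeholder, then compare `usage.prompt_tokens` from the non-streamed responses. The `strip_images` helper below is hypothetical:

```python
import copy

def strip_images(history: list) -> list:
    """Replace image_url parts with a text placeholder, leaving text parts intact."""
    stripped = copy.deepcopy(history)
    for msg in stripped:
        if isinstance(msg.get("content"), list):
            msg["content"] = [
                part if part.get("type") != "image_url"
                else {"type": "text", "text": "[image omitted]"}
                for part in msg["content"]
            ]
    return stripped

# Then compare prompt token counts across two non-streamed calls:
#   with_img = client.chat.completions.create(model=..., messages=history)
#   no_img   = client.chat.completions.create(model=..., messages=strip_images(history))
#   print(with_img.usage.prompt_tokens - no_img.usage.prompt_tokens)
# A large positive difference means the historical image is being "visioned" again.
```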
Hi! What do you mean by a unique code for identifying the image? We cannot add any metadata to images, just the plain image and the text along with it.
Yes, this is what I’m doing. Sometimes GPT remembers it can analyze the image, sometimes not.