I fed in a link to an image that was posted online before March 14, and the model knew what it was about. But when I uploaded a new image and fed its link to the GPT-4 API, it said it doesn't have the ability to read images. So I speculate that GPT-4 was trained on images scraped from the web, along with their links, which is why it can't read a new image. That is, it doesn't really have the ability to read images.
Mostly because the image-input feature's status is still "coming soon."
This would be an incorrect assumption. The entire point of using images with a multimodal model is to give it images it has never seen before and was never trained on, and then do something based on those pixels.
Use the DALL-E API. It's impressive for text-to-image.
Or use CLIP to go from image to text.
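To illustrate, CLIP can score an image against candidate text descriptions, which is one way to get from pixels to text today without GPT-4's image input. Below is a minimal sketch using the Hugging Face `transformers` port of CLIP; the checkpoint name and candidate labels are illustrative choices, and the synthetic red square stands in for any image the model has never seen.

```python
# Sketch: image-to-text matching with CLIP
# (pip install transformers torch pillow)
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# A synthetic image CLIP was certainly never trained on.
image = Image.new("RGB", (224, 224), color=(220, 30, 30))
labels = ["a solid red square", "a photo of a dog", "a photo of the ocean"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # shape: (1, len(labels))
best = labels[probs.argmax().item()]
print(best)
```

Note that CLIP matches an image against descriptions you supply rather than generating free-form text, so it works best when you can enumerate plausible captions up front.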