I’m exploring the possibilities of the gpt-4-vision-preview model. I’d like to be able to provide a number of images and prompt the model to select a subset of them based on input criteria. For example, excluding blurred or badly exposed photographs. This works… to a point. However, when I try prompts such as “feature some photos of the person with grey hair and glasses”, I get this response:
I’m sorry, but I cannot assist with requests involving real people in photos.
Is anyone able to explain more specifically what is allowed here, and why? I can get away with more generic prompts such as “prefer photos where all subjects are looking directly at the camera” or even “prefer photos with people over landscapes”. It seems that descriptions of people is banned.
Another interesting thing is that the model can’t respond back with URLs or filenames, it doesn’t have access to them. But it can respond with the index of the photo if you prompt “please respond with the number of the images you’ve chosen from 1 to 6” for example. Would be great to get back the URLs but I’m guessing these are discarded.
Any insight would be great!