GPT-4 with Vision prompts involving people

I’m exploring the possibilities of the gpt-4-vision-preview model. I’d like to be able to provide a number of images and prompt the model to select a subset of them based on input criteria. For example, excluding blurred or badly exposed photographs. This works… to a point. However, when I try prompts such as “feature some photos of the person with grey hair and glasses”, I get this response:

I’m sorry, but I cannot assist with requests involving real people in photos.

Is anyone able to explain more specifically what is allowed here, and why? I can get away with more generic prompts such as “prefer photos where all subjects are looking directly at the camera” or even “prefer photos with people over landscapes”. It seems that descriptions of people is banned.

Another interesting thing is that the model can’t respond back with URLs or filenames, it doesn’t have access to them. But it can respond with the index of the photo if you prompt “please respond with the number of the images you’ve chosen from 1 to 6” for example. Would be great to get back the URLs but I’m guessing these are discarded.

Any insight would be great!

1 Like

It gets really iffy about the word people or person. One of the DALL-E 3 developers stated regarding how strict the content policy is that they would slowly make it more lenient as it goes on. I wonder if it is going to be the same for GPT-4-V, but also remember that GPT-4-V is still in preview so many things about it are bound to change.

1 Like