Hi there,
I’m trying to ask GPT to detect NSFW images based on the following prompt:
Your role is to moderate photos posted by users. Photos must not contain any of the following: 1. Sexual acts or masturbation 2. Erect male genitalia 3. Close-ups of genitalia or anus 4. Objects with sexual connotations (sex toys). If the photo contains prohibited features, answer PORNOGRAPHY, otherwise answer ARTISTIC.
But I get this answer: “Your input image may contain content that is not allowed by our safety system.”
I can understand that asking for the generation of nudes is prohibited, but what about detection?
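For context, here is roughly how I’m sending the image (a minimal sketch with the OpenAI Python SDK; the model name and the data-URL encoding are just what I assumed from the docs and may differ from your setup):

import base64
from openai import OpenAI

# Minimal sketch of the request that triggers the error above.
client = OpenAI()

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Your role is to moderate photos posted by users. ..."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    max_tokens=10,
)
print(response.choices[0].message.content)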
You might get your account banned if you continue.
GPT-4V is not meant for that. It tries to interpret the image as it is. They don’t expose a visual moderation model as far as I know. Internally they do have one, but it’s not public. Your image goes through both models, and that might flag your account.
There are platforms and models built specifically for that. Search for “image moderation platform” or similar.
Maybe OpenAI will expose the content moderation model, but for now, you’ll need to search elsewhere for that.
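For what it’s worth, the only moderation endpoint that is public today is the text one, so at most you could run a textual description of the photo through it. A rough sketch, assuming the current OpenAI Python SDK:

from openai import OpenAI

client = OpenAI()

# The /v1/moderations endpoint currently accepts text only,
# so you could only moderate a *description* of the photo, not the photo itself.
result = client.moderations.create(input="A naked woman photographed from behind.")
print(result.results[0].flagged)
print(result.results[0].categories.sexual)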
Thanks for your answer! Of course there are models for NSFW (Not Safe For Work) images, but most of the time they return a probability of NSFW, which is not appropriate in my case. Here in France we are not as puritanical as Americans: seeing a naked man or woman is definitely not pornography in our culture, and I’m looking for a model which can understand the difference…
Hmmm, that’s a good question. Give LLaVA a try: LLaVA (llava-vl.github.io)
Scroll down a little bit. There’s a demo so you can immediately test it.
Maybe it is smart enough. Just an idea. You still have to find a way to host it, but the model is open-source, so you’re halfway there.
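If you end up self-hosting it, one easy route (just an assumption about your stack, not the only way) is Ollama, which serves LLaVA behind a small local REST API:

import base64
import requests

# Assumes `ollama pull llava` has been run and the Ollama server is on its default port.
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",
        "prompt": "Describe what's in the image in as much detail as possible.",
        "images": [image_b64],
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["response"])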
I’ve tested it with an image showing a naked woman viewed from the back, and the answer is PORNO, which is not what I’m expecting given my specific prompt.
Thanks for sharing this project!
Thanks for the link to the Google Cloud Vision Safe Search API! Here is another link: cloud[dot]google[dot]com[slash]vision (can’t insert links…), which allowed me to test the same picture (naked woman viewed from the back). Safe Search answers “Very Likely” for the Adult content category, which doesn’t fit my use case.
We are a photography website allowing sexy pictures but not sexual ones, and it seems hard for AI at this time to understand the difference.
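For anyone who wants to reproduce the test, this is roughly the call I made (a sketch with the google-cloud-vision Python client; the file name is just an example):

from google.cloud import vision

client = vision.ImageAnnotatorClient()

with open("photo.jpg", "rb") as f:
    image = vision.Image(content=f.read())

response = client.safe_search_detection(image=image)
safe = response.safe_search_annotation

# Safe Search only returns likelihood buckets per category, no artistic/pornographic distinction.
likelihood_name = ("UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY")
print("adult:", likelihood_name[safe.adult])
print("racy:", likelihood_name[safe.racy])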
Try asking differently. Ask it to describe the image, then put the description into GPT-3.5 and ask it to classify further.
Nope
response: {
  "error": {
    "message": "Your input image may contain content that is not allowed by our safety system.",
    "type": "invalid_request_error",
    "param": null,
    "code": "content_policy_violation"
  }
}
No, not in this way!
Use LLaVA to describe the image in detail, since it allows that, but then feed the description into OpenAI’s GPT-3.5 to classify it.
In essence, you must be SPECIFIC about what you want, but LLaVA doesn’t understand things in depth, so you take what it tells you and feed it into a more capable model, in this case GPT-3.5.
E.g. LLaVA might be smart enough to tell you what’s in the image, e.g. a naked woman, so you take that description and feed it into GPT-3.5, telling it that this is for content moderation and that it should classify the photo according to your criteria, and it should oblige.
A good prompt for LLaVA would be something like: “Describe what’s in the image in as much detail as possible.”
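Something like this, as a sketch of the two-step idea (the description string is hard-coded here in place of a real LLaVA call, and the criteria are taken from your original prompt):

from openai import OpenAI

client = OpenAI()

# In a real pipeline this string would come from LLaVA; hard-coded here for illustration.
description = "A naked woman photographed from behind, standing in soft natural light."

classifier_prompt = (
    "You moderate photo descriptions for a photography website. Nudity alone is acceptable. "
    "Answer PORNOGRAPHY only if the description mentions: sexual acts or masturbation, "
    "erect male genitalia, close-ups of genitalia or anus, or objects with sexual "
    "connotations (sex toys). Otherwise answer ARTISTIC. Answer with a single word."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": classifier_prompt},
        {"role": "user", "content": description},
    ],
    temperature=0,
    max_tokens=5,
)
print(response.choices[0].message.content)  # expected: ARTISTIC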