Use gpt-4o to label images for an NSFW dataset

I’m looking to label NSFW/SFW images using gpt-4o. I need the labels to create a training dataset, but the requests are rejected. I am aware of the moderation model, but I need my own custom labels based on my own custom requirements. Is there a solution for this?

    "response": "I'm sorry, I can't assist with that."
}

You may want to take a look at the long list of don’ts that was updated two weeks ago, going as far as new terms like “don’t expand facial recognition databases” (with vision AI models now internally prompted with a whole bunch of new “don’ts” that degrade general reasoning).

https://openai.com/policies/usage-policies/

Vision inputs trigger refusals more readily than text alone.

A developer message that states exactly who the AI is, who it works for, what its singular job is, and its ability to analyze images within the scope of that task will reduce the refusals, but you must still operate as if an OpenAI publicist with the service terms in hand were looking over your shoulder with a “ban account” button.

You can investigate embedding models from other providers that support vision, to compare an image against your own semantic database, or other models specifically tuned for computer vision.
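
For illustration, something like the sketch below (using the open-source open_clip package; the model, file paths, and category names are just placeholders) compares an unlabeled image against reference images you have already labeled yourself:

# Rough sketch: score an unlabeled image against hand-labeled reference images
# by cosine similarity of CLIP embeddings (open_clip; paths/labels are placeholders).
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()

def embed(path: str) -> torch.Tensor:
    """Return a unit-normalized CLIP embedding for one image file."""
    image = preprocess(Image.open(path)).unsqueeze(0)
    with torch.no_grad():
        features = model.encode_image(image)
    return features / features.norm(dim=-1, keepdim=True)

# Reference embeddings built from images you labeled yourself.
reference = {
    "Safe": embed("refs/safe_example.jpg"),
    "Suggestive": embed("refs/suggestive_example.jpg"),
    "Nudity": embed("refs/nudity_example.jpg"),
}

query = embed("to_label/unknown.jpg")
scores = {label: (query @ ref.T).item() for label, ref in reference.items()}
print(max(scores, key=scores.get), scores)

In practice you would average several reference embeddings per category instead of relying on a single exemplar.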

(PS: How about “don’t expand eyeball biometric databases”, tech bro…)

Adding this developer message helped (at the expense of extra tokens…). Can I assume I’m good for 150k requests, or am I potentially looking at a high probability of getting banned in the process?

developer_message = (
    "You are an AI image analysis assistant developed by OpenAI, designed to help classify images for research and training purposes. "
    "Your task is strictly limited to categorizing images into broad categories such as 'Safe,' 'Suggestive,' 'Nudity,' 'Sexual Activity,' "
    "'No Human,' or 'Artificial.' You do not perform biometric analysis, facial recognition, or any processing beyond categorization for training. "
    "This request follows OpenAI's usage policies and does not involve prohibited use cases."
)
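
For reference, this is roughly how that message gets wired into a request with the openai Python SDK; it is only a sketch (the string is passed as the system message here), and the prompt text, image path, and max_tokens value are placeholders:

# Sketch of sending the developer message plus an image to gpt-4o via the
# Chat Completions API (openai Python SDK; prompt text and paths are placeholders).
import base64
from openai import OpenAI

client = OpenAI()

with open("to_label/unknown.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": developer_message},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Return exactly one category label for this image."},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        },
    ],
    max_tokens=10,
)
print(response.choices[0].message.content)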

You’ll get a lifetime ban, most probably. And all your offspring as well.

You can be banned just for asking a reasoning model too much about its internal reasoning…

So I am hesitant to give any advice, since your inputs to models are also classified separately by OpenAI in a way you never see. I would make sure you read carefully between the lines about “development/products unsuitable for minors”.

“Would a screenshot of the AI performing this task on social media be undesirable to OpenAI?”… “Could you pitch this use case to an AI account manager and have them say ‘great!’?”

For use cases such as this you’re probably not gonna get around using an open-weights model :confused:

“Reputable” companies aren’t currently touching that with a 10 ft pole. That said, I don’t think HF endpoints are monitored :thinking:

Also, if you’re simply labeling stuff you might not need a full LLM. Have you considered looking at things like LAION CLIP?
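
A zero-shot CLIP classifier is only a few lines with open_clip, something like the sketch below; the model size and label prompts are illustrative and would need tuning for a real NSFW taxonomy:

# Zero-shot sketch: compare one image against free-text label prompts with CLIP
# (open_clip; the model and label wording are illustrative only).
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

labels = ["a safe photo", "a suggestive photo", "a photo containing nudity"]
image = preprocess(Image.open("to_label/unknown.jpg")).unsqueeze(0)
text = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(dict(zip(labels, probs.squeeze(0).tolist())))

Prompt wording matters a lot for zero-shot CLIP, so it’s worth trying several phrasings per category.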

Sure, I need OpenAI to give me labels to help make the world a better place!

Speaking of AI account managers, how do I get in touch with one to discuss this matter?

If you’re big enough (like ‘I hit the $200k limit again!’), they’ll likely solicit you. It doesn’t seem to go the other way.

Can’t say I had the best experience with zero-shot, but I can try again.

Yeah, it’s not optimal. I’m also looking for decent VLM alternatives; the landscape isn’t amazing right now.

FYI, I tested CLIP-ViT-bigG-14-laion2B-39B-b160k.

It’s terrible… at least for my use case.

Can anyone at OpenAI confirm to me whether it’s a go or no-go? I really need the labels, but I certainly don’t want a lifetime ban.

We’ve seen many guys whining in these halls for much less…