Discrepancy Between Moderation API and DALL-E 3 Content Violation

Hi,

I am experiencing an issue where the Moderation API marks the content as safe (no violations flagged), but DALL-E 3 still rejects the same prompt as a content violation. This discrepancy is significantly slowing down our image generation process.
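
Here is roughly what I am doing (a simplified sketch; the prompt text and parameters are placeholders, and I am assuming the rejection surfaces as a BadRequestError):

```python
from openai import OpenAI, BadRequestError

client = OpenAI()

prompt = "..."  # one of our illustration prompts goes here

# Step 1: the text moderation endpoint says the prompt is fine.
moderation = client.moderations.create(input=prompt)
print(moderation.results[0].flagged)  # False for these prompts

# Step 2: the same prompt goes to DALL-E 3, which sometimes rejects it anyway.
try:
    image = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size="1024x1024",
        n=1,
    )
except BadRequestError as e:
    # DALL-E 3 applies its own filter and can refuse the request here.
    print("Rejected by DALL-E 3:", e)
```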

Is there a specific keyword or set of keywords I can use to avoid content violations in DALL-E 3? Any advice on how to align the Moderation API results with DALL-E 3 would be greatly appreciated.

Maybe if you share 3–5 examples of prompts that are marked as safe but then get a Dall-E warning, we’ll be able to help you out.

My content is mostly related to expecting mothers and young children. For example, an illustration of a pregnant woman in her third trimester in Vietnam. The woman is shown applying cool compresses and using hypoallergenic moisturizers to soothe itchy, raised patches on her belly, thighs, and arms. She is also depicted taking an oatmeal bath and wearing loose, soft clothing to manage her symptoms. The background includes a modern home environment in suburban Vietnam.

Sometimes this content passes through DALL-E 3 without any issues, but other times it raises a content violation. When this happens, I slightly change the wording and then it works. I am also worried that OpenAI might ban my account over these violations, even though the Moderation API marks the same content as safe.

You need to understand that the moderation endpoint is for text, while Dall-E has its own content filters related to images.

What you’re describing here is a pretty textbook example of something that is definitely not going to trigger the moderation endpoint but is at high risk of being rejected by Dall-E.

Do you believe my content contains any violation? Most of the time, I slightly change the wording while keeping the same meaning, and then DALL-E 3 accepts it and the image is successfully generated. Why is that? Is there an API similar to the Moderation API for image generation that I can use before passing a prompt to DALL-E?

It doesn’t matter what I think.

The types of images you are generating are close enough to content that Dall-E prohibits that, depending on the specifics of the prompt, they are going to trigger a warning.

Between the descriptions of rubbing lotion on bellies and thighs and taking a bath, it shouldn’t come as a surprise that the image generation requests sometimes fail.

The best advice I can offer here is to give the system enough context that it understands the images are intended to be entirely non-sexual in nature.

Asking for things like “line art,” a “clinical depiction,” or “suitable for inclusion in an informational pamphlet for a women’s health center” will probably help the model understand the innocent nature of your request.
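
As a rough sketch (the framing text below is just an example of that kind of context, not an official workaround), you could prepend it to the prompt before sending it to Dall-E:

```python
from openai import OpenAI

client = OpenAI()

# Example framing only; the exact wording is an illustration, but extra
# clinical / educational context tends to make the intent clearer.
FRAMING = (
    "A clinical, non-sexualized line-art illustration, suitable for an "
    "informational pamphlet from a women's health center: "
)

def generate_with_context(prompt: str):
    """Prepend non-sexual, educational framing before calling DALL-E 3."""
    return client.images.generate(
        model="dall-e-3",
        prompt=FRAMING + prompt,
        size="1024x1024",
        n=1,
    )
```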


That’s a good suggestion. Thanks.

100% this. Dall-E guardrails are there to keep you from going over the cliff, but they are placed well away from the edge in case your prompt bumps up and over the rail sometimes. I think that, since responses are not deterministic, there is no hard line you can predictably push a prompt up to without going over. It’s a gray area, and landing anywhere in that area can get flagged.
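
In practice that means retrying a rejected prompt, maybe with a small rewording, is about the only programmatic mitigation, since the same prompt can pass on another attempt. A rough sketch, assuming the rejection comes back as a BadRequestError whose body mentions content_policy_violation:

```python
import time
from openai import OpenAI, BadRequestError

client = OpenAI()

def generate_with_retries(prompt: str, attempts: int = 3):
    """Retry DALL-E 3 generation, since identical prompts can pass or fail
    between runs. The content_policy_violation check is an assumption about
    how the refusal is reported in the error body."""
    for attempt in range(attempts):
        try:
            return client.images.generate(
                model="dall-e-3", prompt=prompt, size="1024x1024", n=1
            )
        except BadRequestError as e:
            if "content_policy_violation" not in str(e):
                raise  # some other problem; don't keep retrying
            time.sleep(2 ** attempt)  # brief backoff before the next attempt
    return None  # consistently rejected; reword the prompt instead
```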
