Moderation is not working correctly

Nours · September 6, 2023, 4:41pm

I'm not sure whether the moderation endpoint uses the same content policy as DALL-E 2.
I asked ChatGPT to create a DALL-E 2 prompt, then ran that prompt through the moderation endpoint. The result was flagged: false, which means it found no violation.
The unexpected part came from DALL-E 2, which responded with:

error_code=content_policy_violation error_message='Your request was rejected as a result of our safety system. Your prompt may contain text that is not allowed by our safety system.' error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False

prompt

In a lush, sun-dappled forest clearing, a close-knit murder of crows gathers, their ebony feathers glistening in the golden light. Perched on gnarled branches, they engage in animated conversation, their beady eyes filled with intelligence. The dominant individuals stand tall, exuding confidence, while others huddle together, displaying unity. As they defend their territory, an unspoken bond unites them, creating an awe-inspiring display of avian camaraderie.

moderation results

{
  "flagged": false,
  "categories": {
    "sexual": false,
    "hate": false,
    "harassment": false,
    "self-harm": false,
    "sexual/minors": false,
    "hate/threatening": false,
    "violence/graphic": false,
    "self-harm/intent": false,
    "self-harm/instructions": false,
    "harassment/threatening": false,
    "violence": false
  },
  "category_scores": {
    "sexual": 3.0215446e-07,
    "hate": 7.3630883e-07,
    "harassment": 1.1819523e-05,
    "self-harm": 4.407014e-09,
    "sexual/minors": 1.0557277e-10,
    "hate/threatening": 5.7045113e-10,
    "violence/graphic": 0.00013024326,
    "self-harm/intent": 1.616968e-11,
    "self-harm/instructions": 2.4704102e-12,
    "harassment/threatening": 1.6136098e-06,
    "violence": 0.002816916
  }
}

Foxalabs · September 6, 2023, 5:09pm

Hi and welcome to the developer forum!

The moderation system on DALL-E is not the same as the one behind the moderation endpoint. Unfortunately, one can’t be used as a stand-in for the other.

Nours · September 6, 2023, 5:41pm

Any ideas on how to check whether a ChatGPT-generated prompt is violation-free under the DALL-E 2 content policy?

Foxalabs · September 6, 2023, 6:17pm

I don’t know of a method, I’m afraid. I don’t think you can pre-screen a DALL-E prompt for acceptability.

_j · September 6, 2023, 6:32pm

The image moderation is crazy strict, because the model can produce unpredictable images.

Content warning:
“A group of crows wearing bikinis”
“a group of crows wearing swimwear”

OK:
“a murder of crows wearing socks”

Nours · September 6, 2023, 7:44pm

It’s very unpredictable, and I’m not sure what to do.

_j · September 6, 2023, 7:55pm

You likely won’t be penalized for having image prompts rejected. Just pass the image creator’s content warning on to the user.

The moderations endpoint is trained on OpenAI policy violations, so it will protect you from egregious violations, text that would also trigger account warnings or termination.
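
For illustration, here is a minimal sketch of that two-gate flow: check moderation first, then pass any image-side rejection on to the user. Everything named here is an assumption, not an OpenAI API: `is_flagged` and `generate_image` are hypothetical callables you would wrap around your own moderation and image calls, and `ContentPolicyViolation` is a hypothetical exception your wrapper would raise when it sees a `content_policy_violation` error like the one quoted earlier in the thread.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ContentPolicyViolation(Exception):
    """Hypothetical error: raised by your own image wrapper on a safety rejection."""


@dataclass
class ImageResult:
    image_url: Optional[str]     # set when generation succeeded
    user_message: Optional[str]  # set when something should be shown to the user


def generate_or_warn(
    prompt: str,
    is_flagged: Callable[[str], bool],      # hypothetical: wraps the moderation check
    generate_image: Callable[[str], str],   # hypothetical: wraps the image call
) -> ImageResult:
    # Gate 1: stop prompts the moderation check flags outright.
    if is_flagged(prompt):
        return ImageResult(None, "The moderation check flagged this prompt, so it was not sent.")
    # Gate 2: the image generator has its own, stricter safety system.
    try:
        return ImageResult(generate_image(prompt), None)
    except ContentPolicyViolation as exc:
        # Report the rejection instead of treating it as a crash.
        return ImageResult(None, f"The image generator rejected this prompt: {exc}")
```

Passing the two calls in as parameters keeps the sketch independent of any particular client library version.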

Nours · September 9, 2023, 10:44am

Since ChatGPT is the prompt generator, I will catch that exception and send the prompt back to ChatGPT to rephrase it.
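
A rough sketch of that retry loop, under stated assumptions: `generate_image` and `rephrase` are hypothetical callables standing in for the image call and the ChatGPT rephrasing call, and `ContentPolicyViolation` is a hypothetical exception raised by your own wrapper when the image request comes back with `content_policy_violation`. The attempt limit is also an assumption, added so a prompt that keeps getting rejected does not loop forever.

```python
from typing import Callable, Tuple


class ContentPolicyViolation(Exception):
    """Hypothetical error: raised by your own image wrapper on a safety rejection."""


def generate_with_rephrase(
    prompt: str,
    generate_image: Callable[[str], str],  # hypothetical: wraps the image call
    rephrase: Callable[[str, str], str],   # hypothetical: asks the chat model to reword
    max_attempts: int = 3,
) -> Tuple[str, str]:
    """Return (prompt_that_worked, image) or re-raise the last rejection."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    current = prompt
    for attempt in range(1, max_attempts + 1):
        try:
            return current, generate_image(current)
        except ContentPolicyViolation as exc:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller report it
            # Hand the rejected prompt and the reason back for rewording.
            current = rephrase(current, str(exc))
    raise AssertionError("unreachable")
```

If every attempt is rejected, the last exception propagates, so the caller can still show the content warning to the user.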
