Is the Moderation API a mandatory layer? How is it different from the model's built-in moderation? For example, the six categories such as self-harm and violence.
The moderations endpoint is a protective layer against unknown inputs: you can submit potential prompts to it and receive either a flag in a specific category or a score you can use to take more refined action.
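For illustration, here is a minimal sketch of a call to the endpoint using the official Python SDK; the input text is a placeholder, and `omni-moderation-latest` is the current moderation model:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.moderations.create(
    model="omni-moderation-latest",
    input="Some untrusted user text you have not vetted yet.",
)

result = resp.results[0]
print(result.flagged)                   # overall True/False verdict
print(result.categories.violence)       # per-category boolean flag
print(result.category_scores.violence)  # per-category score between 0 and 1
```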
They no longer document that API calls screened through moderations are an assurance of non-violation of policy, and there is no connection to the API call you make afterwards to say "hey, this was checked first." This is likely because a lot of content can fall between the categories, and because the scoring is unpredictable: scores can be greatly affected by simply truncating text off the front or back.
So, if you are in control of what's being sent, you don't need to use this endpoint. It also can't keep up with the rate limits allowed at higher usage tiers, if those requests were actually users typing away. And for a long time it handled nothing but text (there is now a newer moderation model that also accepts images), while inputs such as PDFs or images uploaded to Assistants can certainly cause the AI to produce undesired content.
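As a sketch of that newer image-capable moderation (assuming the `omni-moderation-latest` model; the URL and caption below are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# omni-moderation-latest accepts mixed text and image inputs;
# the URL is a placeholder (base64 data URLs also work).
resp = client.moderations.create(
    model="omni-moderation-latest",
    input=[
        {"type": "text", "text": "Caption accompanying the upload"},
        {"type": "image_url", "image_url": {"url": "https://example.com/upload.png"}},
    ],
)
print(resp.results[0].flagged)
```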
The AI model is able to issue its own refusal in many cases: the "I'm sorry, but I can't assist with that" that shuts you down without discussion. That is not a "moderation" layer, just the model's own trained understanding. There is also a separate inspection of outputs that looks for reproduction of copyrighted material such as song lyrics and will terminate the output.
There is no documentation at all of the undisclosed scan-and-ban policies, beyond what you can read in the terms and conditions.
Hi @sayandigital!
The difference is in the fidelity and the “intent of use”.
"Standard" OpenAI APIs like ChatCompletions have some guardrails with respect to the usage policy: essentially a legal shield for OpenAI and its customers with respect to copyright infringement, harmful content, etc. If you send harmful content to ChatCompletions, you are in violation of that policy.
The Moderation API uses models specifically fine-tuned on moderation datasets. It is also, by design, intended to receive potentially harmful and questionable material (so it can flag it for you).
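To make that division of labor concrete, here is a hedged sketch (the model names and rejection message are illustrative, not prescribed by OpenAI) that screens untrusted input with the Moderation API before forwarding it to ChatCompletions:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def respond_to_user(user_text: str) -> str:
    # The moderations endpoint is the place to send unvetted, possibly
    # harmful text; it returns flags instead of putting you in violation.
    verdict = client.moderations.create(
        model="omni-moderation-latest",
        input=user_text,
    ).results[0]

    if verdict.flagged:
        # Handle however your application requires; here we just refuse.
        return "Sorry, that input was flagged by moderation."

    # Only input that passed the screen reaches the chat model.
    chat = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": user_text}],
    )
    return chat.choices[0].message.content
```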
Hope that clarifies it!