Is OpenAI moderation baked into the models, or is explicit integration with the moderation API mandatory?

The moderations endpoint is a protective layer against unknown inputs: you submit a potential prompt and receive either a flag in a specific category or a per-category score that you can use to take more refined action.
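To make the "flag or score" distinction concrete, here is a minimal sketch of acting on one moderation result. It assumes the result has already been fetched and converted to a plain dict with `flagged` and `category_scores` keys; the threshold values and the `decide_action` helper are hypothetical, made up for illustration.

```python
from typing import Dict

# Hypothetical per-category thresholds; tune them against your own traffic.
THRESHOLDS: Dict[str, float] = {"harassment": 0.5, "violence": 0.7}
DEFAULT_THRESHOLD = 0.9  # assumed fallback for categories not listed above


def decide_action(result: dict) -> str:
    """Return "block", "review", or "allow" for one moderation result.

    `result` is assumed to look like:
    {"flagged": bool, "category_scores": {category_name: score}}
    """
    # The endpoint's own flag is treated as a hard stop.
    if result.get("flagged"):
        return "block"
    # Otherwise use the raw scores for a softer, more refined decision.
    for category, score in result.get("category_scores", {}).items():
        if score >= THRESHOLDS.get(category, DEFAULT_THRESHOLD):
            return "review"
    return "allow"
```

The point of the middle "review" tier is that an unflagged input can still carry a score high enough that you may want to treat it differently.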

They no longer document that API calls sent to moderations are an assurance of non-violation of policy, and there is no connection with the API call you make afterwards to say “hey, it was checked first”. This is likely because a lot of content can fall between the categories, and because the quality is unpredictable: scores can be greatly affected just by truncating text off the front or back.
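Since scores can shift with where the text is cut, one possible mitigation (my own suggestion, not something OpenAI documents) is to score overlapping windows of a long input and keep the worst score per category. A rough sketch, where `score_fn` is a hypothetical callable you supply that wraps your moderation call and returns `{category: score}`, and the window sizes are arbitrary:

```python
from typing import Callable, Dict, List

Scores = Dict[str, float]


def split_windows(text: str, size: int = 2000, overlap: int = 200) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    windows = []
    start = 0
    while True:
        windows.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window reached the end of the text
        start += step
    return windows


def max_scores(
    text: str,
    score_fn: Callable[[str], Scores],
    size: int = 2000,
    overlap: int = 200,
) -> Scores:
    """Score every window and keep the highest score seen per category."""
    worst: Scores = {}
    for window in split_windows(text, size, overlap):
        for category, score in score_fn(window).items():
            worst[category] = max(score, worst.get(category, 0.0))
    return worst
```

This costs one moderation call per window, so it only makes the rate limit problem worse on long inputs.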

So, if you are in control of what’s being sent, you don’t need to use this endpoint. It also can’t keep up with the rate limits allowed at higher tiers, if those requests were actually users typing away. It was also limited to text (there is now a newer moderations model that accepts images), and things like PDFs or images uploaded to Assistants can certainly cause the AI to produce undesired content.

The AI model is able to do its own refusal in many cases: the “I’m sorry, but I can’t assist with that” that shuts you down without discussion. That is not “moderation”, just the model’s own understanding. There is also a separate inspection of outputs that looks for reproduction of copyrighted material such as song lyrics and will terminate the output.

There is no documentation at all of the scan-and-ban policies; all you have is what you can read in the terms and conditions.