How to avoid being blocked when trying to filter potentially harmful content?

Hi all,

we need to process user comments for content moderation.

Unfortunately the moderation API is not sufficient as it lacks certain criterias we use to mark “negative” content.

Therefore we use the completions API with a system prompt like “Please provide a score between 1 and 10 to the following text and rate down if following criterias are matched [criterias follow].”

Now we received a warning from the OpenAI team because we are submitting harmful content.

The problem is: we need to submit harmful content as we want to filter it out.

Are there any possibilities to avoid being blocked by OpenAI?

Regards