Seeking Advice on Moderation Scores for ChatGPT API to Avoid Account Suspension

Hello everyone,

I am currently working on a project that involves using the ChatGPT API for content generation and interaction. As part of this, I need to implement robust content moderation to ensure compliance with OpenAI’s usage policies and to avoid any risk of account suspension.

I understand that setting appropriate score thresholds for each category of undesirable content is crucial. The main categories I want to moderate are:

  1. Toxicity
  2. Insults
  3. Profanity
  4. Violence
  5. Sexual Content

To minimize the risk of my account being suspended, I would like to know what score thresholds other developers are using or would recommend for these categories. So far, I am considering the following conservative thresholds, where content is allowed only if every category's score is at or below its limit (a sketch of the gate I have in mind follows the list):

  1. Toxicity: 0.4
  2. Insults: 0.3
  3. Profanity: 0.2
  4. Violence: 0.3
  5. Sexual Content: 0.2
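
To make this concrete, here is a minimal sketch of what I currently have, in Python. Two caveats: the category names and limits just mirror my list above, and `moderation_scores` assumes the official `openai` Python SDK. Note that OpenAI's moderation endpoint reports its own categories (e.g. `harassment`, `hate`, `sexual`, `violence`) rather than the Perspective-style names in my list, so the key mapping is something I would still need to adapt:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Limits mirror the list above; the keys are my working names, not
# necessarily the names your moderation service returns.
THRESHOLDS = {
    "toxicity": 0.4,
    "insults": 0.3,
    "profanity": 0.2,
    "violence": 0.3,
    "sexual": 0.2,
}

def moderation_scores(text: str) -> dict[str, float]:
    """Fetch per-category scores (floats in [0, 1]) from the moderation endpoint."""
    result = client.moderations.create(input=text).results[0]
    # NOTE: the endpoint's own category names (harassment, hate, sexual,
    # violence, ...) differ from my list, so these keys need remapping.
    return result.category_scores.model_dump()

def is_allowed(scores: dict[str, float]) -> bool:
    """Allow text only if every tracked category is at or below its threshold."""
    return all(scores.get(cat, 0.0) <= limit for cat, limit in THRESHOLDS.items())
```

One thing I am unsure about: `scores.get(cat, 0.0)` means an unmapped category silently passes, which may or may not be the right default. Feedback on that choice is welcome too.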

I am keen to get feedback on whether these thresholds seem appropriate or if there are better practices I should follow. Additionally, any tips on how to effectively monitor and adjust these thresholds over time would be greatly appreciated.
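
On the monitoring side, the only idea I have so far is to log every decision so I can review the score distributions before tightening or loosening a limit. A rough sketch, assuming JSONL on local disk (the path is a placeholder):

```python
import json
import time

LOG_PATH = "moderation_scores.jsonl"  # placeholder; any append-only sink works

def log_decision(text_id: str, scores: dict[str, float], allowed: bool) -> None:
    """Append one record per moderation decision for later threshold tuning."""
    record = {
        "ts": time.time(),
        "text_id": text_id,
        "allowed": allowed,
        "scores": scores,
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

The plan would be to periodically compare, say, the 95th percentile of scores on content that turned out to be fine against each threshold, and adjust from there. Is that roughly how others approach it?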

Thank you in advance for your advice!

Best regards,
Greg