Custom content moderation - like the OpenAI Moderation model, but not based on it

I have searched and do not see this kind of inquiry. There are many scenarios where content (email, announcements, contracts, product documentation) is created within a company (or by a blogger, article writer, etc.) and we want to ensure the content meets some defined standard. I’d like to know how we can create a model to do this. The Moderation Endpoint is a great example of this already in action, returning a numeric score across various categories.
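(For reference, here is a minimal sketch of calling the existing Moderation endpoint with the current Python SDK; the categories it scores are fixed by OpenAI, which is exactly the limitation I’m trying to get around. Exact attribute names may vary with SDK version.)

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

result = client.moderations.create(input="Some outbound customer note...")

# Each result carries one numeric score per fixed category
# (harassment, hate, etc.); the categories themselves cannot be customized.
print(result.results[0].category_scores)
```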

Examples:

  • Does the content end with a salutation, closing, or “thank you”?
  • Is there an apology in the outbound note? That is, we never say the product is broken or needs to be fixed; we always thank the contact for their feedback.
  • Is there a promise or commitment for an action in the outbound note?
  • If this document fits a specific category, does it include a disclaimer with text like “…” ?
  • Does the body of the text follow a formal or informal tone?

This can be used for policy compliance, message format and content standardization, avoiding unfortunate statements of fitness or suitability, and so many other similar requirements.

For each policy/category there would be a value indicating compliance. Following the convention of the Moderation endpoint (and common anti-spam/malware tools), where a high value indicates a high probability of failure, a missing disclaimer might yield a high value for that category, while a thank-you near the end would yield a low one. The company might strive for a middle-of-the-road but slightly more formal persona, so a tone value in the range of 5 to 7 might be considered ideal.
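Something like the sketch below is what I have in mind, assuming the scoring is done by prompting a chat model to return JSON rather than by a purpose-built classifier; the model name, policy wording, and 0-10 scale are placeholders, not a real implementation:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical per-company policies; higher score = higher probability of failure,
# except "tone", which is graded on a 0 (informal) to 10 (formal) scale.
POLICIES = {
    "closing": "high score if the note is missing a salutation, closing, or thank-you",
    "apology": "high score if the note apologizes or says the product is broken",
    "commitment": "high score if the note promises or commits to a specific action",
    "disclaimer": "high score if a required disclaimer is missing",
    "tone": "0 = very informal, 10 = very formal",
}

def score_policies(text: str) -> dict:
    """Ask the model to grade the text against each policy, returning 0-10 scores."""
    system = (
        "You grade business correspondence against internal policies. "
        "Return a JSON object with one numeric score from 0 to 10 per policy:\n"
        + json.dumps(POLICIES, indent=2)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model that supports JSON output
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": text},
        ],
    )
    return json.loads(response.choices[0].message.content)

print(score_policies(
    "Thanks for your feedback! We'll review the shipping delay and follow up by Friday."
))
```

A fine-tuned classifier could replace the JSON prompt, but the shape of the output (one value per policy/category) would be the same.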

Or, am I still just dreaming? 🙂

Thanks!
