I’m using the API to do something similar to the chat preset in the Playground.
With prompt engineering, I’m creating a series of characters with specific personalities. I have all kinds of people, made for books, stage plays, etc. Sometimes a character uses rough language and has preferred topics that can be borderline with respect to the moderation thresholds.
I don’t actually use the free moderation endpoint (I use the API only for myself), so I expected the output to strictly follow the rules of my prompt. I see that this is not the case, and I understand that it’s OpenAI’s choice.
But I wanted to know whether my account is in any danger, because sometimes the input expresses a concept or contains a word that would certainly fail moderation.
Is there any way to know if my account is already at risk? Will I receive a warning by email about it?
Really, I don’t care about moderating the input myself (response time is already not ideal): if the answer given by the API is filtered, that’s fine with me.