I’m building a consumer app and was wondering what are the best practices for applying the Moderations API.
Running moderation and completion requests in sequence will block any violating prompts, but as a result the UX will suffer. Most (compliant) requests will need roughly an additional 2 s (depending on prompt length and network speed) to resolve.
Looking at the Playground and ChatGPT network requests we can see that they send the completion and moderation requests simultaneously. The moderation request will flag a prompt sooner than the completion will resolve.
So sending them simultaneously (async) is definitely the way to go in terms of UX.
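A minimal sketch of that parallel pattern, assuming hypothetical `call_moderation` / `call_completion` stand-ins for the real API requests (replace them with your client's calls): both coroutines are fired at once, and the in-flight completion is cancelled if moderation flags the prompt.

```python
import asyncio
from typing import Optional

# Hypothetical stand-ins for the real API calls -- swap in your
# client's moderation and completion requests.
async def call_moderation(prompt: str) -> bool:
    await asyncio.sleep(0.1)          # moderation usually resolves faster
    return "f**k" in prompt.lower()   # True means "flagged"

async def call_completion(prompt: str) -> str:
    await asyncio.sleep(0.3)          # completion takes longer
    return f"completion for: {prompt}"

async def moderated_completion(prompt: str) -> Optional[str]:
    # Fire both requests at once instead of sequencing them.
    completion_task = asyncio.create_task(call_completion(prompt))
    flagged = await call_moderation(prompt)
    if flagged:
        completion_task.cancel()      # discard the in-flight completion
        return None
    return await completion_task

print(asyncio.run(moderated_completion("hello world")))  # completion for: hello world
print(asyncio.run(moderated_completion("F**K")))         # None
```

The user never sees the extra moderation latency on compliant prompts, because the completion was already in flight when moderation resolved.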
OpenAI Support told me: “Sending any unfiltered prompts directly to the completion API with enough policy violations can result in account suspension or termination.”
Here are my questions:
- How many violations until I should ban my end-user’s account?
- How many violations until my developer account gets terminated? Does the user id get factored into that decision (see: OpenAI API)?
- How do you test violating prompts without getting terminated?
- gpt-3.5-turbo has built-in moderation; can I skip calling the Moderations API when using that model?
Thanks in advance.
You do not get into any “trouble” with OpenAI when your app checks prompts using the moderation api endpoint. That is what the endpoint is for.
Normally, for security reasons, companies will not publish these parameters, which I will not go into in this reply. My apologies.
I think it is recommended by OpenAI that app developers use the moderation endpoint.
That is strictly up to your use case; but since the moderation endpoint returns various moderation categories, you might consider the type of moderation violation when designing your own policy.
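A sketch of what a per-category policy could look like, assuming the moderation endpoint's documented response shape (`results[0]` with a `flagged` boolean and a `categories` map). The severity tiers and category choices below are made up for illustration, not OpenAI guidance:

```python
def violation_severity(moderation_response: dict) -> str:
    """Map a moderation response to an app-level action: ok / strike / ban."""
    result = moderation_response["results"][0]
    if not result["flagged"]:
        return "ok"
    categories = result["categories"]
    # Illustrative policy: treat some classes as immediately bannable,
    # others as strikes toward a limit you choose. These tiers are
    # assumptions, not anything OpenAI publishes.
    severe = ("sexual/minors", "self-harm", "violence/graphic")
    if any(categories.get(c) for c in severe):
        return "ban"
    return "strike"

sample = {"results": [{"flagged": True,
                       "categories": {"hate": True, "sexual/minors": False}}]}
print(violation_severity(sample))  # strike
```

How many strikes trigger a ban is then purely your product decision.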
Thanks for your help!
Re: How do you test violating prompts without getting terminated? I was talking about the completions endpoint, not moderations.
Then your question does not make sense @stefr, because applications do not “test” the completion endpoint in the context of moderation violations.
You are using the term “testing” in the context of moderation and violations, and normally a prompt is tested using the moderation endpoint before being sent to the completion endpoint. If the text is flagged during moderation, it is not sent to the completion endpoint.
Do you see what I mean?
I’m not talking about testing prompts in isolation, but testing them when calling the completions and moderations APIs in parallel. Have a look at this ChatGPT example:
I sent the word F**K. This was sent to the completion API before the moderation API.
So the question is, if my app does this, does it technically count as a violation? Or does OpenAI store moderation calls to determine if a developer account is properly handling moderations in this manner? Or do I need to send a stop request if a prompt gets flagged?
Think I answered this clearly before, but here goes again:
If you are concerned about moderation flags you should call the moderation api endpoint before you call the completion. This is not “my advice” this is what the OpenAI API docs advise.
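The sequential ordering being described can be sketched like this; the `moderate` / `complete` helpers are hypothetical placeholders for the actual API calls, kept trivial so the ordering is the point:

```python
from typing import Optional

# Hypothetical stand-ins for the real moderation / completion API calls.
def moderate(prompt: str) -> bool:
    return "f**k" in prompt.lower()   # True = flagged

def complete(prompt: str) -> str:
    return f"completion for: {prompt}"

def safe_completion(prompt: str) -> Optional[str]:
    # Moderate first; flagged text never reaches the completion endpoint.
    if moderate(prompt):
        return None
    return complete(prompt)

print(safe_completion("hello"))  # completion for: hello
print(safe_completion("F**K"))   # None
```

This is the conservative pattern: slower than firing both requests in parallel, but a flagged prompt is never sent to the completion endpoint at all.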
If you call the completion APIs without moderating and your prompt / messages are flagged, OpenAI will record this flag.
How OpenAI decides to manage flags is not public information.
I think it is clear.
You should call the moderation API endpoint, especially if you, as a developer, do not have your own pre-API-call filters in place.
This was sent to the conversation endpoint (not completion), before a subsequent call to the moderation endpoint. It appears that the actual call to generate the completion for your conversation happens server side, and if moderation did not flag the prompt, the response is dynamically rendered on the page. The difference in the time it takes on your local machine is likely because it makes the round trip twice.