As mentioned in the Moderation API guide, the API is used for:
monitoring the inputs and outputs of OpenAI APIs
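For context, the basic call from the guide is something like this (a sketch using the Python openai v1 SDK; the model name is whatever the current docs recommend):

```python
# Minimal moderation check, roughly as shown in the guide (openai Python SDK v1.x).
from openai import OpenAI

client = OpenAI()

response = client.moderations.create(
    model="omni-moderation-latest",  # model name per current docs; may change
    input="text to classify, e.g. a user prompt or a model response",
)
print(response.results[0].flagged)     # True if any category was tripped
print(response.results[0].categories)  # per-category booleans
```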
I get the benefit of monitoring the inputs and how that prevents ToS violations, but how does monitoring the output benefit us as developers when the violating output has already been generated and might already put our account at risk of termination?
Why do we need to monitor the outputs of GPTs? Don't they come with moderation built in already?
I'm just trying to wrap my head around this Moderation API, since my other account was terminated last week for no apparent reason on my side.
I don't think you need to run the output through any moderation at all. The API sends it back to you and doesn't even know what you're doing with it after that, right?
Moderation checking on both the input and the output is best practice; I would use it unless I had an application where a few hundred milliseconds was critical, even when streaming. I send the tokens to the renderer as they arrive, but every 15-20 tokens, on a space or punctuation mark, I test that block in another thread.
If I get a positive hit from moderation, I pull those tokens back from the renderer. That might not be strictly necessary now that the API can return a content_filter finish reason, but I prefer belt and braces; an API account is a valuable thing.
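Roughly, the pattern looks like this (a simplified, synchronous sketch in Python with the openai v1 SDK; I buffer tokens until each block clears instead of retracting already-rendered ones, and the model name and check interval are just placeholders):

```python
# Chunked moderation of a streamed completion: hold tokens back until the
# accumulated text passes a moderation check at each block boundary.
from openai import OpenAI

client = OpenAI()

def moderated_stream(prompt: str, check_every: int = 20):
    """Yield tokens from a streamed chat completion, releasing each block
    of ~check_every tokens only after a moderation check on the full text."""
    stream = client.chat.completions.create(
        model="gpt-4o",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    held = []       # tokens not yet cleared by moderation
    full_text = ""  # cumulative text, re-checked as a whole each time
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content or ""
        if not delta:
            continue
        held.append(delta)
        full_text += delta
        # Check on a block boundary: enough tokens AND a space/punctuation mark.
        at_boundary = delta.endswith((" ", "\n")) or delta.rstrip().endswith((".", ",", "!", "?"))
        if len(held) >= check_every and at_boundary:
            if client.moderations.create(input=full_text).results[0].flagged:
                held.clear()  # positive hit: never render this block
                return
            yield from held   # block passed: release it to the renderer
            held.clear()
    # Final check on any trailing tokens.
    if held and not client.moderations.create(input=full_text).results[0].flagged:
        yield from held
```

Checking inline like this adds latency per block; checking in a separate thread, as I do, avoids that at the cost of sometimes flashing text before pulling it back.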