I’m asking because I see that “finish_reason” can be “content_filter”. However, I have seen ChatGPT flag not only my user input but, at times, only its own output. So does this moderation feature apply to both the input and the output? Or do I need to run every new input message through the moderation endpoint and rely on “finish_reason”: “content_filter” only for the API’s own output?
Maybe someone has experience with this, or can point me to explicit documentation~ <3
For ChatGPT, they run the moderation endpoint on the input and again on the output as it is generated, but checking the output yourself is not recommended unless you want to act on the results.
For the API, you most likely want to run only the input through the moderation endpoint, and you can skip even that if you are certain the input is going to be clean.
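A minimal sketch of that input-side check, assuming the documented moderation response shape (a “results” list whose entries carry a boolean “flagged”). The names is_flagged, guarded_send, moderate, and send are hypothetical; moderate and send stand in for whatever client calls you actually use:

```python
from typing import Callable, Optional


def is_flagged(moderation_response: dict) -> bool:
    """Return True if any result in a moderation response is flagged."""
    results = moderation_response.get("results", [])
    return any(r.get("flagged", False) for r in results)


def guarded_send(
    user_text: str,
    moderate: Callable[[str], dict],
    send: Callable[[str], str],
) -> Optional[str]:
    """Moderate the input first; only call the model if it is clean.

    `moderate` and `send` are hypothetical callables wrapping your own
    API client (a moderation request and a chat completion request).
    Returns None when the input is flagged, so the model is never called.
    """
    if is_flagged(moderate(user_text)):
        return None
    return send(user_text)
```

Passing the two calls in as arguments keeps the gate itself free of network code, so it can be exercised with stubs before wiring in a real client.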
I believe that if you are getting finish_reason: content_filter, you can potentially have your API access revoked. I also believe the filter is only meant for the input, so you wouldn’t start getting output and then have it abruptly cancelled.
I could be mistaken and maybe they have decided to also vet the output on their end. I don’t believe there’s any documentation which states this and I doubt this is the case unless they sporadically check the output.
A finish reason of “content_filter” will be emitted by Azure’s version of OpenAI services.
If you are coding for that platform, then as soon as you have confirmed that you actually received valid JSON, you must check for this finish reason, even in stream chunks, because a filtered response won’t have the other parts of the JSON (dictionary) that you may be hoping to extract.
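Roughly, that check could look like the sketch below. It assumes the usual chat completions shape (“choices”, with “message” in full responses and “delta” in stream chunks); extract_text is a hypothetical helper name, not part of any SDK:

```python
import json
from typing import Optional, Tuple


def extract_text(raw: str) -> Tuple[Optional[str], bool]:
    """Parse one response body or stream chunk; return (text, filtered).

    Checks finish_reason before touching any content fields, since a
    filtered choice may not carry "message" or "delta" content at all.
    """
    data = json.loads(raw)  # raises ValueError if the payload is not valid JSON
    choices = data.get("choices") or []
    if not choices:
        return None, False  # a chunk that carries no choices
    choice = choices[0]
    if choice.get("finish_reason") == "content_filter":
        return None, True
    # Full responses use "message"; stream chunks use "delta".
    part = choice.get("message") or choice.get("delta") or {}
    return part.get("content"), False
```

Using .get() throughout means a missing key yields None instead of a KeyError, which is the failure the paragraph above warns about.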
The moderation endpoint won’t usually help here, as Azure runs its own stricter filter, while moderation just looks for OpenAI policy violations. You can request that Azure’s filter be set to a lower threshold for your application.