Question about moderation for API usage

eslof.github · October 20, 2023, 11:36pm

I see that “finish_reason” can be “content_filter”. However, I have seen ChatGPT flag not only my user input but also, at times, only its own output. So I’m wondering: does this moderation feature apply to both the input and the output? Or do I need to run every new input message through the moderation endpoint and rely on “finish_reason”: “content_filter” only for the API’s own output?

Maybe if someone has experience, or can lead me to explicit documentation~ <3

RonaldGRuckus · October 20, 2023, 11:49pm

For ChatGPT, they run the moderation endpoint on the input and again during the output, but doing that yourself for the output isn’t recommended unless you want to take advantage of the results.

For the API you most likely want to run only the input through the moderation endpoint unless you are certain it’s going to be clean going in.
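As a rough sketch of that input-first check: the helper below only calls the model when moderation does not flag the input. `send_if_clean`, `moderate`, and `complete` are hypothetical names, not part of any SDK. The one thing taken from the real API is the response shape of the moderation endpoint, which returns a `results` list whose entries carry a boolean `flagged` field. The two callables are passed in so you can wire them to whichever client you use.

```python
from typing import Callable, Optional


def send_if_clean(
    user_input: str,
    moderate: Callable[[str], dict],
    complete: Callable[[str], str],
) -> Optional[str]:
    """Call `complete` only when moderation does not flag the input.

    `moderate` and `complete` are hypothetical wrappers you supply:
    `moderate` returns the parsed JSON of a moderation request and
    `complete` returns the model's reply text.
    """
    result = moderate(user_input)
    # Moderation responses look like {"results": [{"flagged": bool, ...}]}.
    if any(r.get("flagged", False) for r in result.get("results", [])):
        return None  # Flagged: never send this input to the model.
    return complete(user_input)


# Stand-in callables for illustration only; replace with real API calls.
def fake_moderate(text: str) -> dict:
    return {"results": [{"flagged": "forbidden" in text}]}


def fake_complete(text: str) -> str:
    return "echo: " + text
```

Returning `None` for flagged input keeps the decision about what to show the user in your own code rather than in the helper.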

I believe if you are getting finish_reason: content_filter you can potentially have your API access revoked. I also believe it’s only meant for the input, and you wouldn’t start getting output that gets abruptly cancelled.

I could be mistaken and maybe they have decided to also vet the output on their end. I don’t believe there’s any documentation which states this and I doubt this is the case unless they sporadically check the output.

_j · October 20, 2023, 11:57pm

A finish reason of “content_filter” will be emitted by Azure’s version of OpenAI services.

If you are coding for that platform, then as soon as you have confirmed that you actually got valid JSON, you must look for this finish reason, even in stream chunks, because a filtered response won’t have the other parts of the JSON (dictionary) that you may be hoping to extract.
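A minimal sketch of that check, assuming the usual chat completion layout where each entry in `choices` has a `finish_reason` plus either a `message` (full response) or a `delta` (stream chunk). `extract_text` is a hypothetical helper name, and it only looks at the first choice.

```python
from typing import Any, Optional


def extract_text(payload: dict[str, Any]) -> Optional[str]:
    """Return the text of the first choice, or None if it was filtered.

    Works on already-parsed JSON for either a full response or a single
    stream chunk. Hypothetical helper, not part of any SDK.
    """
    choices = payload.get("choices") or []
    if not choices:
        return ""  # Nothing to extract (some chunks carry no choices).
    choice = choices[0]
    # Check the finish reason first: a filtered choice may lack the
    # "message"/"delta" keys entirely.
    if choice.get("finish_reason") == "content_filter":
        return None
    part = choice.get("message") or choice.get("delta") or {}
    return part.get("content") or ""
```

Using `.get()` throughout means a filtered or partial chunk never raises a `KeyError`, so the caller only has to handle the `None` case.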

The moderation endpoint won’t usually help, as Azure runs its own stricter filter, while moderation just looks for OpenAI policy violations. You can request that Azure’s filter be set to a lower threshold for your application.
