Using ChatGPT to detect harmful behavior without losing access due to violating OpenAI's content policies

yak · May 24, 2023, 6:51pm

Open to feedback for a problem I’m trying to solve.

I’m working with content on a social platform, and I’d like to try to use ChatGPT to detect vague pii. Essentially, my idea is to send a prompt that asks something like “does this text seem to refer to or identify a private individual, who is not a public individual or celebrity? if this is likely please respond with {pii: yes}, if it is not likely or you aren’t sure respond {pii:no}” (actual prompt wording to change). I’m facing two primary issues trying to dive into this.

Text content that I will be sending to the api is likely to include violative content against OpenAI’s content policies, just like it would be violative against my platform’s policies. Is this likely to get my access to the api revoked?

Is there a better solution to explore for this issue? I looked into the moderation api, but it seems that there is not a categorization for pii that would help solve the issue I’m trying to address.

Thanks for any guidance.

sps · May 24, 2023, 7:06pm

Welcome to the community @yak

IIRC, the Azure Cognitive Services have a text Analytics API that can identify and redact PII.

Topic		Replies	Views
GPTs accepting images (or ChatGPT or API too) that have personal identifying information (PII) Community api , image-reading , gpts	5	2212	May 17, 2024
Clarification on Using Moderation Model to Avoid Policy Violations API gpt-4 , api	3	442	October 9, 2024
Building chatbot that needs to respond to user messages that are censored API	3	104	February 14, 2025
GPT-3 API concerned users may get me banned Community	3	2655	December 20, 2023
Tips for "filtering" content submitted by user message Community	3	2398	April 2, 2023

Using ChatGPT to detect harmful behavior without losing access due to violating OpenAI's content policies

Related topics