Using ChatGPT to detect harmful behavior without losing access due to violating OpenAI's content policies

I’m open to feedback on a problem I’m trying to solve.

I’m working with content on a social platform, and I’d like to try using ChatGPT to detect vague PII. Essentially, my idea is to send a prompt that asks something like “Does this text seem to refer to or identify a private individual, i.e. someone who is not a public figure or celebrity? If this is likely, please respond with {pii: yes}; if it is not likely or you aren’t sure, respond with {pii: no}” (actual prompt wording subject to change). I’m facing two main issues as I dig into this.

The text I will be sending to the API is likely to include content that violates OpenAI’s content policies, just as it would violate my platform’s policies. Is this likely to get my API access revoked?

Is there a better solution to explore for this? I looked into the Moderation API, but it doesn’t seem to have a PII category that would help with the issue I’m trying to address.
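To make the prompt idea above concrete, here is a minimal Python sketch of the two pieces around the model call: building the prompt and parsing the `{pii: yes}` / `{pii: no}` reply. The names `build_prompt` and `parse_pii_reply` are hypothetical, the wording is just the draft from above, and no API call is made here; sending the prompt to a model is left out on purpose. Anything that doesn't match the expected reply shape is treated as "no", mirroring the "if you aren't sure" rule.

```python
import re

# Draft instruction text taken from the post above; the wording is a placeholder.
INSTRUCTIONS = (
    "Does this text seem to refer to or identify a private individual, "
    "i.e. someone who is not a public figure or celebrity? "
    "If this is likely, respond with {pii: yes}; "
    "if it is not likely or you aren't sure, respond with {pii: no}."
)

# Accepts {pii: yes}, {pii:no}, and JSON-ish variants like {"pii": "yes"}.
_REPLY_RE = re.compile(r'\{\s*"?pii"?\s*:\s*"?(yes|no)"?\s*\}', re.IGNORECASE)


def build_prompt(text: str) -> str:
    """Combine the fixed instructions with the user-generated text to classify."""
    return INSTRUCTIONS + "\n\nText:\n" + text


def parse_pii_reply(reply: str) -> bool:
    """Return True only when the reply clearly contains {pii: yes}.

    Unparseable or unexpected replies fall back to False.
    """
    match = _REPLY_RE.search(reply)
    if match is None:
        return False
    return match.group(1).lower() == "yes"
```

Defaulting to "no" keeps false positives down but will miss cases where the model rambles instead of answering; logging the unparseable replies separately would probably be worth it.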

Thanks for any guidance.

Welcome to the community, @yak.

IIRC, Azure Cognitive Services has a Text Analytics API that can identify and redact PII.
