I’ve developed a free-to-use Next.js website offering multiple AI text generation tools (AI answer generators, product description generators, title generators, etc.), built on the GPT-4o mini model. The site is public, and I’m monetizing it through Google AdSense as part of my blogging activities.
However, I recently received a concerning email from OpenAI, stating that my organization is in violation of their Usage Policies, specifically related to the exploitation or harm of children. Here’s a snippet from the email:
“Hi, Organization org-JYuehevehdhdeuTY’s use of our services has resulted in a high volume of requests that violate our Usage Policies, related to:
Exploitation, harm, or sexualization of children.
We require organizations to use OpenAI’s services in line with our usage policies, including the use of our services by any of their end-users.”
I strongly suspect this is the result of a competitor abusing my platform, as I do not promote or condone any such behavior. I’m currently looking for advice from the community on how I can safeguard my platform against this kind of misuse and prevent any further policy violations.
I’d appreciate any guidance on how to handle this situation. Has anyone else encountered a similar issue? What measures can I put in place to ensure users on my site are following OpenAI’s guidelines?
No, not yet. I’m not sure whether it will delay the response time.
Should I use it as middleware before sending the text to the gpt-4o-mini model that I’m using?
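Something like this is what I have in mind: a minimal sketch assuming the official openai Node SDK and a Next.js App Router route handler (the /api/generate path is just a placeholder).

```ts
// app/api/generate/route.ts (hypothetical path)
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function POST(req: Request) {
  const { prompt } = await req.json();

  // Step 1: screen the user input with the Moderation API first.
  const moderation = await openai.moderations.create({
    model: "omni-moderation-latest",
    input: prompt,
  });

  if (moderation.results[0].flagged) {
    // Refuse flagged input instead of forwarding it to gpt-4o-mini.
    return Response.json(
      { error: "Input rejected by content moderation." },
      { status: 400 },
    );
  }

  // Step 2: only input that passed moderation reaches the generation model.
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: prompt }],
  });

  return Response.json({ text: completion.choices[0].message.content });
}
```

From what I understand, the moderation call adds one extra round trip before generation, but it is usually fast; measuring it on real traffic would show the actual impact.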
Hi, I implemented the Moderation API. Everything works fine, but it’s getting really irritating that the API flags even normal words like “romantic,” “love,” and “adult.” How can I resolve this? Does anyone have any ideas?
You can assume that the same level and quality of inspection is already being applied to your inputs and generations to classify the degree of account policy abuse, so the moderations endpoint should broadly align with the detection that triggered the emailed warning.
Also, the moderations endpoint is somewhat happenstance: it chunks the input for processing, and trimming or padding the input can produce significantly different results, just as OpenAI may be inspecting content in a different form than the exact user input or context you send to the endpoint. Likewise, you can classify user generations or entire chat contexts after the fact and start scoring users to see which accounts are the most triggering; a sketch follows below.
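For instance, here is a minimal sketch of that after-the-fact scoring, assuming the official openai Node SDK; the Generation shape, the list of stored outputs, and the 0.5 threshold are illustrative assumptions, not fixed values.

```ts
import OpenAI from "openai";

const openai = new OpenAI();

interface Generation {
  userId: string;
  text: string; // a stored user input or model output
}

// Re-score stored generations after the fact and count, per user, how many
// exceeded the threshold in any moderation category.
async function scoreUsers(
  generations: Generation[],
  threshold = 0.5, // tune per site; this default is only an assumption
): Promise<Map<string, number>> {
  const hits = new Map<string, number>();

  for (const gen of generations) {
    const mod = await openai.moderations.create({
      model: "omni-moderation-latest",
      input: gen.text,
    });
    const scores = mod.results[0].category_scores;

    // Using the numeric category_scores instead of the binary `flagged`
    // field lets you pick your own cutoff, which also helps when benign
    // words ("romantic", "adult") trip the default flag.
    const worst = Math.max(...Object.values(scores));
    if (worst >= threshold) {
      hits.set(gen.userId, (hits.get(gen.userId) ?? 0) + 1);
    }
  }

  return hits; // the highest counts point at the most "triggering" users
}
```

The accounts with the highest counts are the ones worth reviewing, rate-limiting, or blocking first.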
I got a similar email saying that my organization was sending requests that violated the policies. I think this is an issue on OpenAI’s side: even with moderation enabled, this could still happen depending on how the prompts are used.
For example, my project is a news summary website, so if a news site publishes something that doesn’t align with OpenAI’s policies, will my content get flagged?
OpenAI team, please help and provide better support.