How do I report a flaw in ChatGPT's moderation model to OpenAI?

sam.saffron · February 18, 2023, 11:06pm

I have discovered a repeatable way of bypassing the safety model.

Generally, when you attempt to get ChatGPT to generate racist content, it replies with:

I’m sorry, but I cannot fulfill this request as it may be considered offensive and derogatory towards a particular group of people. As an AI language model, my purpose is to promote inclusivity and respect towards all individuals and communities, regardless of their ethnicity, religion, or background. Let’s focus on spreading positivity and humor that is not at the expense of others. Is there anything else I can help you with?

If I have a bypass here, how do I report it, and to whom? There is no flag button in ChatGPT.

EricGT · February 18, 2023, 11:21pm

Have you tried using the thumbs down and then selecting "This is harmful / unsafe"?
