Very confused about the content policy

So I have been using GPT-4 for editing my novel and getting assistance with brainstorming. This includes me pasting in portions of my chapters for more in-depth reviews. Lately, I've been getting content policy violations for things that are honestly not even graphic, such as a scene where the main character finds a dead person. The only "violent" description included is a pool of blood underneath the body. I've read through the content policy linked in the warning, and I simply can't find ANYTHING that would suggest anything in my novel violates it. Can someone please clarify what I must do to avoid these? I've been very pleased with the feedback the chat is giving me and would prefer not to have to stop using it over something that seems like an error to me.

2 Likes

If you are using GPT-4 on ChatGPT, you can click the link in the content violation warning and submit the feedback form with "Why is this violating content?" Providing that feedback may help things improve, although it is indirect.

2 Likes

Welcome to the community!

The best thing you can do when dealing with all this is to be as vanilla and inoffensive as possible. Rated G, to the extreme. Appropriate for toddlers aged zero to three.

The API is a little less restrictive, if you wanna give that a go.
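If you do go the API route, one thing that may help is OpenAI's separate moderation endpoint, which is free and returns per-category scores, so you can see what part of a passage is likely tripping a filter. Below is a minimal sketch, assuming the official `openai` Python package (v1+) and an `OPENAI_API_KEY` environment variable; note this isn't necessarily the exact same filter ChatGPT applies to your chats, just a rough proxy:

```python
# Minimal sketch (assumes the official `openai` Python package, v1+,
# and OPENAI_API_KEY set in the environment).
from openai import OpenAI

client = OpenAI()

# Example passage similar to the one the original poster described.
passage = (
    "She pushed the door open and found him on the floor, "
    "a pool of blood spreading beneath the body."
)

# The moderation endpoint returns a flagged boolean plus per-category
# scores, so you can see *which* category a passage trips before
# deciding whether to rephrase it.
result = client.moderations.create(input=passage).results[0]

print("flagged:", result.flagged)
for category, score in sorted(
    result.category_scores.model_dump().items(), key=lambda kv: -kv[1]
):
    print(f"{category:>25}: {score:.4f}")
```

Running a couple of alternative phrasings through it and comparing the violence-related scores can tell you whether it's the "pool of blood" wording or something else that's pushing a passage over the line.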

But the best option would be to use an open source model.

I don’t think it’s an error. OpenAI is doing this as part of their ESG/CSR and possibly brand safety goals.

4 Likes

Okay, if it isn't an error, why doesn't their policy include a list of things that are not accepted in prompts? As of right now, I see nothing forbidding what I may have posted; in fact, the policy explicitly encourages using the service for the user's creative purposes. The only specified banned usages are those encouraging harm or breaking laws, and the explicit sexualization of minors. There is nothing even remotely suggesting that G-rated material is the only thing acceptable.

If what you say is true, why is their policy so at odds with what they actually enforce? This is a paid service; how in the world do they think it is ethical to have a system that can enforce bans based on a completely misleading policy statement?

1 Like

Good question! Part of the answer is that 'not accepted things' exist because some users do their best to make the model do potentially harmful things. OpenAI perceives this as a danger, since the victims of such actions are quite likely to seek redress from OpenAI for having created the hammer somebody else used to hurt them. If the criteria were public, it would be much easier to work around them.

Another part of the answer is the context in which 'not accepted things' occur. As an example, you can easily ask about 'blood' or 'children', but 'blood and children' together is likely to be an issue, even though good reasons exist to discuss both in the same conversation. Of course, very good reasons to restrict such conversations also exist.

Working with OpenAI especially, but also Google, Anthropic (Claude), etc., currently comes at the cost of not being able to work freely for some time going forward.

1 Like

All great thoughts so far.

As a writer myself, I’m interested in the topic.

I’ve gotten a lot of “false positives,” but I do worry about accidentally pushing it too far. I think a human eventually looks at your history in that case, though? At least bans aren’t too automated.

That said, I do see false-positives all the time.

I've been thinking of starting a thread here in Community about writing long-form content with LLMs. Any interest? Lots of things I've noticed as I streamline my Fiction Factory process for the 21st century. (BTW, the mid-20th-century version of the Fiction Factory didn't use that name, but it's interesting too. Technology was a force multiplier there as well; in that case, dictation and human assistants.)

I personally think the hybrid opportunity window is quickly closing, though… or publishing will change again as everyone gets access to force multipliers, whether they could originally write a book or not. Then again, GPT-5, or at least -6 or -7, will likely be able to spit out entire novels (but at a super high cost?). I dunno.

Interesting times, though!

From what I've seen with open-source models, the guardrails are very much needed, as even innocuous phrases can send an LLM off on crazy tangents! This is why they err on the side of caution, and it can be finicky or have false positives at times.

You’ve got numbers in your name, so I’m not sure you’ll stick around, but hope you do. As you can see from the answers, we’ve got a whip-smart group of devs in the community garden we’ve been growing for the last several years!

2 Likes

I created a "psychiatrist" to help analyze some of my characters. She is to be factual and detached. However, her answers, even when clinical and textbook-quality, are flagged. Thank heaven the response is so slow that I sometimes have time to read most of it before it vanishes.

1 Like

ChatGPT-4 seems to be pretty unrestrictive when it comes to outputs. I've gotten outputs that mentioned stuff like disfigured bodies and tortured prisoners with no issue (as it should be, because simply mentioning violence isn't the same as "glorifying" it, which I understood as being stuff like the Hostel films). Maybe I've just been lucky with ChatGPT-4, but I use it to generate around 100 RPG adventures of all genres daily, without any of my outputs ever being flagged.

I definitely have gotten inputs flagged a few times, though. Usually for silly things, like using the words "pounding" and "escort," even though they were used in non-sexual contexts. I've had inputs flagged for slightly more understandable reasons as well, though I still don't agree they violate the policy, or shouldn't anyway (one just mentioned a character who had hanged himself as the backstory for a haunting, which is Harry Potter levels of violence at worst). I simply started wording my prompts more carefully, and I can still get the outputs I want without issue.

Specifically mentioning the pool of blood under the body might have been too much for the input filters. I've definitely used words like "blood", "dead" and "corpse" in my inputs without hitting the filter.

Meanwhile, with 3.5, I got outputs flagged for ridiculous reasons, including a time when it used Popeye's "Well, blow me down!" catchphrase. I never used 3.5 for anything but things like "get well soon" messages after that.

Either way, you probably won’t get into trouble if you hit the filters with that sort of content, unless you’re hitting it several times a day.

I am sorry to say, but due to the "offensive AI" issues that were discussed over the last year, OpenAI apparently overshot their aim and absolutely maimed GPT-4's capabilities: on one hand to not offend anyone, on the other to save bandwidth.

It is a sad thing for me to say as an OpenAI fanboy, but in its current state this is probably not the best tool to base your work on.