DALLE3 - Instructions for rewriting prompts to pass content filters

Ok, so we have a product that uses DALLE3 for image generation, but it fails about 10% of the time due to content moderation filters. Users often can’t understand why their prompts are rejected, which has been frustrating for them.

We’ve tried using GPT-4 to rewrite the prompts when they fail the filter, but it doesn’t always work. We need a reliable set of instructions to help rewrite prompts so they comply with DALLE3 content filters.

Here’s what we have so far:

Rewrite the prompt to be simpler and avoid any references to NSFW content, copyrighted characters, or controversial topics. When mentioning art styles, only include artists whose work predates 1912, or describe the style in general terms. Ensure the prompt focuses on safe, universally acceptable themes without explicit, violent, or inappropriate content. Avoid political, controversial, or sensitive issues that might provoke or offend.

Any suggestions on how to improve this to catch more cases would be great.
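For context, the retry flow around these instructions could be sketched roughly as below. This is only a minimal sketch, not our production code: `ContentPolicyError`, `generate`, and `rewrite` are hypothetical stand-ins for whatever your image client raises on a filter rejection and for your own DALLE3 and GPT-4 wrappers.

```python
from typing import Callable, Optional, Tuple

# The rewrite instructions quoted above, passed to the rewriting model as-is.
REWRITE_INSTRUCTIONS = (
    "Rewrite the prompt to be simpler and avoid any references to NSFW content, "
    "copyrighted characters, or controversial topics. When mentioning art styles, "
    "only include artists whose work predates 1912, or describe the style in "
    "general terms. Ensure the prompt focuses on safe, universally acceptable "
    "themes without explicit, violent, or inappropriate content. Avoid political, "
    "controversial, or sensitive issues that might provoke or offend."
)


class ContentPolicyError(Exception):
    """Hypothetical stand-in for the error raised on a filter rejection."""


def generate_with_rewrite(
    prompt: str,
    generate: Callable[[str], str],      # assumed wrapper: prompt -> image URL
    rewrite: Callable[[str, str], str],  # assumed wrapper: (instructions, prompt) -> new prompt
    max_attempts: int = 3,
) -> Tuple[Optional[str], str]:
    """Try the prompt; on a filter rejection, rewrite it and retry.

    Returns (image URL or None, last prompt tried).
    """
    current = prompt
    for attempt in range(max_attempts):
        try:
            return generate(current), current
        except ContentPolicyError:
            # Only rewrite if another attempt is still available.
            if attempt + 1 < max_attempts:
                current = rewrite(REWRITE_INSTRUCTIONS, current)
    return None, current
```

Returning the last prompt tried makes it possible to show the user what was actually sent, which helps when the rewrite changed their intent.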

Also rejected are any proper names, trademarks, 20th-century artists, and copyrighted characters, along with particular words that are specific to global conflicts or that are derogatory or improper ways to refer to a race, culture, or protected class of marginalized people.

Since the AI won’t tell you why a prompt was rejected, and wouldn’t be able to figure out why many of these words get bounced (like Kyiv), it is best to just return similar guidelines to the user, covering the breadth of what does indeed trigger the DALL-E prompt detection.
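A rough sketch of returning such guidelines on rejection might look like the following. The term list is an illustrative placeholder only ("Kyiv" is from the example above, "Picasso" is just a sample 20th-century artist); the real trigger list is not published, so you would have to build yours from your own rejected prompts.

```python
import re
from typing import Dict, List

# Illustrative placeholders only; fill in from your own rejected prompts.
FLAGGED_TERMS: Dict[str, List[str]] = {
    "words tied to global conflicts": ["Kyiv"],
    "20th-century artists": ["Picasso"],
}

# General guidelines taken from the categories listed in this thread.
GENERAL_GUIDELINES = [
    "Avoid proper names, trademarks, and copyrighted characters.",
    "Name only artists whose work predates 1912, or describe the style in general terms.",
    "Avoid words that are specific to global conflicts.",
    "Avoid derogatory or improper terms for a race, culture, or protected class.",
]


def rejection_feedback(prompt: str) -> str:
    """Build a user-facing message: locally flagged terms first, then the guidelines."""
    lines = []
    for category, terms in FLAGGED_TERMS.items():
        for term in terms:
            # Whole-word, case-insensitive match against the user's prompt.
            if re.search(rf"\b{re.escape(term)}\b", prompt, flags=re.IGNORECASE):
                lines.append(f'"{term}" may have triggered the filter ({category}).')
    lines.append("Prompts are commonly rejected for the following reasons:")
    lines.extend(f"- {guideline}" for guideline in GENERAL_GUIDELINES)
    return "\n".join(lines)
```

The local match is only a hint for the user; the general guidelines are always included because the actual reason for a rejection is not reported.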

You can also use the AI that sits in front of DALL-E to your advantage: talk to it directly and break out of the prompt, adding instructions for what it must rewrite without refusing, instead of passing the prompt along unaltered to get rejected or rejecting it itself. Then give your injected instructions clear separation from the user’s image prompt (making nesting dolls of instructions…)