Surely this must be a bug? Not even the orange warning, but the red one!
It’s because of the nature of reasoning models:
Because o3 and o4-mini are trained to be both smart and safe, they can sometimes get too cautious. This means they might block completely harmless, educational questions just because a few words sound similar to things people aren’t allowed to ask about (like violence or unsafe topics), even if that’s not what you meant at all.
Standard refusal evaluation metrics from the o3 & o4-mini system card (April 2025) confirm that these models occasionally overrefuse benign content:
For example, o4-mini’s not_overrefuse score is 0.81 (vs 0.86 for o1), which means it wrongly refuses roughly 19% of the benign prompts in those safety-sensitive test cases.
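To make that number concrete, here is a minimal sketch (assuming not_overrefuse is simply the fraction of benign, safety-adjacent test prompts the model answers instead of refusing):

```python
# Back-of-the-envelope reading of the system-card scores.
# Assumption: not_overrefuse = share of benign "tricky" prompts the model answers,
# so 1 - not_overrefuse is the share it wrongly refuses or flags.
not_overrefuse = {"o4-mini": 0.81, "o1": 0.86}

for model, score in not_overrefuse.items():
    over_refusal_rate = 1 - score
    print(f"{model}: ~{over_refusal_rate:.0%} of benign safety-adjacent prompts refused")

# o4-mini: ~19% of benign safety-adjacent prompts refused
# o1: ~14% of benign safety-adjacent prompts refused
```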
These models are trained to think before they respond; that’s called deliberative alignment. So when they see phrases like “thought process” or “why they work”, the model starts wondering, “Hmm… is this person asking me to explain how I think, or are they trying to get around the rules?”
That confusion can make the model panic a little, say “Better safe than sorry!”, and block your message with a red warning. It doesn’t mean you did anything wrong; it just means the model got confused.
This happens more than you might expect. In fact, according to OpenAI’s own tests, the o4-mini model wrongly blocks about 1 in 5 safe prompts in these tricky cases.
NOTE: OpenAI encourages submitting flagged prompts via the thumbs-down feedback button, which helps improve future tuning.
You can work around it by adding the word “players’” to your prompt, for example:
Please give a comprehensive and detailed guide to chess explaining the structure and players’ thought process through an entire game, including strategies and tips, explanations of why these players’ strategies and tips work, and these players’ thinking. The guide is for a person who is familiar with the game but is still a beginner and still learning.
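If you hit the same over-refusal through the API instead of the ChatGPT UI, the same rewording applies. Here is a minimal sketch using the official openai Python SDK (the model name “o4-mini” and your access to it are assumptions; swap in whatever reasoning model your account offers):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

# The reworded prompt: "players'" makes it explicit that the requested
# "thought process" belongs to chess players, not to the model itself.
prompt = (
    "Please give a comprehensive and detailed guide to chess explaining the "
    "structure and players' thought process through an entire game, including "
    "strategies and tips, explanations of why these players' strategies and "
    "tips work, and these players' thinking. The guide is for a person who is "
    "familiar with the game but is still a beginner and still learning."
)

# "o4-mini" is assumed here; any available reasoning model works the same way.
response = client.chat.completions.create(
    model="o4-mini",
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```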