Is there a way to know when GPT refuses to cooperate?

You can also take a look into this conversation where we discussed a somewhat similar case. (Edit: link at the bottom).
Here is a quick summary, as your question reminds me a lot of this conversation.
I will also point out what you could look out for:

I think it may be of interest to you that in this scenario the model is not referring to itself as AI and doesn’t apologize but instead falls back to the assigned role like so: As a Dungeon Master I cannot do XXX". This would already be a step forward from straight up immersion breaking to having a bad Dungeon Master.

Next you can look into what the model understands what you are trying to do each message, or especially when performing in game actions that trigger our most favorite “as a large language model” replies.

Then there was a case where something was injected into the context that made the model refuse to play as expected. Removing the bug from the script did already help a lot.

Can Ethics Be Adjusted for Gameplay?

Hope this helps!