GPT-5.1 broke conditional tool calls

With GPT-5 I was able to run conditional tool calls. Since upgrading to GPT-5.1, they no longer work. I’m using the Responses API.

Here’s an example: I have a send_email tool. I want to check some conditions first, then send an email only if they are met, and send nothing if they are not. I run a prompt like this: “check for condition X, if true send an email alert”. In this case, checking condition X itself requires tool calls.

I make it clear in the system message to check conditions first. Here is the block on emails from my system message:

  • When user specifies conditions for send_email tool use, do this exactly: First explicitly evaluate the conditions. IF conditions are TRUE, send email. If conditions are FALSE, DO NOT call the send_email tool!

The problem is that when the condition is false, it still sends an email anyways, then says “I should not have sent an email under your rule—but I did trigger one anyway”.

If I split into two separate prompts it works. Like this:

  • Send message: “check condition X”… it calls the tool.
  • Send another message: “if condition is met send an email, if not do nothing”

When split into two messages it works as intended. But when run as a single prompt it does not work. Any idea why this is happening? This functionality worked perfectly with GPT-5 and GPT-4.

I tried disabling parallel tool calls, but it still did not work.
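For reference, here is a minimal sketch of the kind of request described above, with parallel tool calls disabled. It only builds the arguments and sends nothing. The send_email schema (`subject`, `body`) and the `build_request` helper are illustrative assumptions, not the exact code in use; `instructions`, `input`, `tools`, and `parallel_tool_calls` are Responses API parameters.

```python
# Sketch only: builds the request arguments, does not call the API.

# The email block from the system message quoted above.
SYSTEM_MESSAGE = (
    "When user specifies conditions for send_email tool use, do this exactly: "
    "First explicitly evaluate the conditions. IF conditions are TRUE, send email. "
    "If conditions are FALSE, DO NOT call the send_email tool!"
)

# Hypothetical schema for the send_email function tool (fields are assumptions).
SEND_EMAIL_TOOL = {
    "type": "function",
    "name": "send_email",
    "description": "Send an email alert.",
    "parameters": {
        "type": "object",
        "properties": {
            "subject": {"type": "string"},
            "body": {"type": "string"},
        },
        "required": ["subject", "body"],
        "additionalProperties": False,
    },
}


def build_request(prompt: str) -> dict:
    """Assemble keyword arguments for a single-prompt Responses API call."""
    return {
        "model": "gpt-5.1",
        "instructions": SYSTEM_MESSAGE,
        "input": prompt,
        "tools": [SEND_EMAIL_TOOL],
        "parallel_tool_calls": False,  # at most one tool call per model turn
    }


request_kwargs = build_request("Check X condition. If true send email.")
# With the official Python SDK this would be sent roughly as:
#   client.responses.create(**request_kwargs)
```

The tools that check condition X would be added to the `tools` list in the same way; they are left out here because their shape depends on the condition.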

One way I’m able to get it to work is by clearly separating each request by periods. For example:

  • “Check X condition if true send email.” does not work
  • “Check X condition. If true send email.” does work.

This only works sometimes; it’s inconsistent even with periods.
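Since prompt phrasing alone is inconsistent, one possible stopgap is to enforce the rule in the application code that executes tool calls, so a premature send_email call is rejected no matter what the model decides. The sketch below is an assumption-heavy illustration, not a fix for the model behavior: `EmailGate`, the `check_condition` tool name, and the result fields are all hypothetical, and it assumes the condition check returns a plain boolean. It relies only on the fact that function call arguments arrive as a JSON string.

```python
import json


class EmailGate:
    """Blocks send_email unless a condition check has returned true first."""

    def __init__(self, send_email, check_condition):
        self._send_email = send_email
        self._check_condition = check_condition
        self._condition_met = None  # None means "not checked yet"

    def dispatch(self, name: str, arguments: str) -> str:
        """Run one tool call and return a JSON string to send back as its output."""
        args = json.loads(arguments or "{}")
        if name == "check_condition":  # hypothetical condition tool
            self._condition_met = bool(self._check_condition(**args))
            return json.dumps({"condition_met": self._condition_met})
        if name == "send_email":
            if self._condition_met is not True:
                # Refuse instead of sending; the model sees this as the tool output.
                return json.dumps(
                    {"sent": False, "error": "blocked: condition not confirmed true"}
                )
            self._send_email(**args)
            return json.dumps({"sent": True})
        return json.dumps({"error": f"unknown tool: {name}"})


# Usage: the condition is false, so the email call is rejected.
sent = []
gate = EmailGate(
    send_email=lambda subject, body: sent.append(subject),
    check_condition=lambda: False,
)
gate.dispatch("check_condition", "{}")
result = gate.dispatch("send_email", '{"subject": "Alert", "body": "X is true"}')
```

This does not explain why the single prompt fails on GPT-5.1, but it keeps an unwanted email from going out while the prompt-side behavior is unreliable.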

Any ideas would be appreciated!