I’ve spent the last two days diving deep into diagnosing this problem. Sharing here for two reasons: 1) it may help somebody else get unstuck, and 2) to get an official explanation from somebody who understands the inner workings of the service.
Problem Overview:
A request to ChatCompletions with a single tool (custom function), with prompting to call the tool in parallel, consistently works. The same request sent to the Responses API consistently does NOT produce parallel tool calls. This behaviour is consistent across both Azure OpenAI and OpenAI.
TL;DR: in the Responses API, if you have exactly one tool defined and tool choice set to `required`, you never seem to get parallel tool calls. If tool choice is not set (the default is `auto`, I believe) and you have only one tool, you DO get parallel tool calls. If you need tool choice set to `required` (to avoid chat responses), you need TWO or more tools defined in the request for parallel tool calls to work.
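To make the failing vs. working shapes concrete, here is a minimal sketch as Responses API request payloads, written in Python for illustration (our code is dotnet; the tool names, prompt, and model choice here are hypothetical placeholders, not our production setup):

```python
# Sketch of the failing vs. working Responses API configurations.
# Tool name, prompt, and schema are hypothetical placeholders.
lookup_tool = {
    "type": "function",
    "name": "lookup_user",
    "description": "Look up a user by id. Call once per id, in parallel.",
    "parameters": {
        "type": "object",
        "properties": {"user_id": {"type": "string"}},
        "required": ["user_id"],
    },
}

# Failing case: exactly one tool + tool_choice="required"
# -> only a single tool call comes back per request.
failing_request = {
    "model": "gpt-4.1",
    "input": "Look up users 1, 2 and 3 using parallel tool calls.",
    "tools": [lookup_tool],
    "tool_choice": "required",
    "parallel_tool_calls": True,  # the default, made explicit here
}

# Working case: add a second tool (even one the model never calls)
# and parallel tool calls start appearing again.
dummy_tool = {
    "type": "function",
    "name": "noop",
    "description": "Placeholder second tool; never expected to be called.",
    "parameters": {"type": "object", "properties": {}},
}
working_request = {**failing_request, "tools": [lookup_tool, dummy_tool]}
```

With the Python client either payload would be sent via `client.responses.create(**request)`; the only difference between the two is the length of the `tools` array.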
Details
We’ve had code running in production for a while that uses ChatCompletions and depends on such a parallel tool call. There are actually a few different places where the logic is based on a given tool being called in parallel. After recently switching our whole product to the Responses API we happened to notice that in one of those places it consistently was NOT doing the parallel call.
We ran a controlled, reproducible test over and over with different permutations of:
- models gpt-4.1 and gpt-5.1
- ChatCompletions vs Responses API
- Azure OAI vs OAI
- parallel tool calls true/false
- tool choice required/auto
- strict true/false
- different prompts with lots of explicit instruction to parallel tool call
- reasoning none/low/high
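The sweep above can be sketched roughly like this (a test-rig outline, not our actual dotnet harness; the dimension names are ours, the values are the ones listed):

```python
import itertools

# Dimensions of the permutation sweep described above.
matrix = {
    "model": ["gpt-4.1", "gpt-5.1"],
    "api": ["chat_completions", "responses"],
    "provider": ["azure_openai", "openai"],
    "parallel_tool_calls": [True, False],
    "tool_choice": ["required", "auto"],
    "strict": [True, False],
    "reasoning": ["none", "low", "high"],
}

# Every combination becomes one test case to run repeatedly.
cases = [dict(zip(matrix, combo))
         for combo in itertools.product(*matrix.values())]
```

That is 2 × 2 × 2 × 2 × 2 × 2 × 3 = 192 combinations, each of which we ran multiple times to check for consistency.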
We then, half by accident, clocked that in the two places where it was working consistently there was more than one tool defined.
With this theory in hand I ran tests over and over, flicking between one tool and two tools. I did this with two different setups (one the product code, the other a test rig with test tools). The behaviour is consistent.
We are using the OpenAI dotnet library and have been quite careful to check we aren’t mapping an option incorrectly or making a similar mistake. We have not checked for openai-dotnet library bugs.
Conclusion
The Responses API and ChatCompletions behave differently when there’s one custom function tool defined, tool choice = required, and parallel tool calls = true (the default).
- ChatCompletions will consistently perform parallel tool calls (what we expect).
- The Responses API will consistently NOT perform parallel tool calls, instead making precisely one tool call per request (not what we expect).
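For reference, this is the check we effectively ran on each response, sketched in Python (our harness is dotnet; the attribute shapes below follow the Python client, and the objects at the bottom are mocks standing in for real API results):

```python
from types import SimpleNamespace as NS

def count_responses_tool_calls(response) -> int:
    """Count function-call items in a Responses API result
    (its `output` is a flat list of typed items)."""
    return sum(1 for item in response.output if item.type == "function_call")

def count_chat_tool_calls(completion) -> int:
    """Count tool calls in a ChatCompletions result
    (`tool_calls` may be None when there are none)."""
    calls = completion.choices[0].message.tool_calls
    return len(calls or [])

# Mocked result shapes standing in for real client responses:
responses_result = NS(output=[NS(type="function_call"),
                              NS(type="function_call")])
chat_result = NS(choices=[NS(message=NS(tool_calls=[object(), object()]))])
```

In the failing configuration the Responses count is always exactly 1 per request, while the ChatCompletions count matches the number of calls the prompt asked for.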
Given the consistency between Azure OAI and OAI, it seems this behaviour is intentional and likely lives in some common layer.
Ask
Anybody at OpenAI or Azure OpenAI with insider knowledge, can you please shed some light?