Responses API: Parallel Tool Calls Not Happening

I’m experimenting with the new Responses API, but I’m only ever seeing one function call per response, even though I’ve set "parallel_tool_calls": true.

My Responses API payload

{
  "input": [
    {
      "content": [
        { "text": "Can you check my tasks and notes on HubSpot?", "type": "input_text" }
      ],
      "role": "user"
    }
  ],
  "model": "gpt-4o",
  "instructions": "....",
  "parallel_tool_calls": true,
  "store": false,
  "user": "...",
  "tool_choice": "auto",
  "tools": [
    { "type": "web_search_preview" },
    {
      "name": "Tool_Nango_Hubspot_OwnerTaskList",
      "description": "Get my HubSpot tasks",
      "parameters": {...}
    },
    {
      "name": "Tool_Nango_Hubspot_OwnerNoteList",
      "description": "Get my HubSpot notes",
      "parameters": {...}
    }
  ]
}

What I get back

{
  "output": [
    {
      "type": "function_call",
      "name": "Tool_Nango_Hubspot_OwnerTaskList",
      "arguments": {
        "action_description": "Retrieving the most recent tasks for from HubSpot.",
        "object_type": "task",
        "sort_by": "hs_lastmodifieddate"
      }
    }
  ],
  "parallel_tool_calls": true,
  …
}

What I expected

• Both Tool_Nango_Hubspot_OwnerTaskList and Tool_Nango_Hubspot_OwnerNoteList to be emitted in the same Responses API reply, so I can execute them in parallel on my end.

Actual behavior

• Only the Tool_Nango_Hubspot_OwnerTaskList tool is ever called; the assistant stops before calling the Tool_Nango_Hubspot_OwnerNoteList tool.


Questions

  1. Does the Responses API currently support true parallel function invocation in GPT‑4o?

  2. Are there additional flags, instruction formats, or tool‑ordering requirements I’m missing?

  3. Any known limitations or best practices for getting multiple function calls in a single Responses API response?

Thanks in advance for any insights!

You might be missing that the non-streaming output is a list that needs to be iterated over. You can’t simply grab output[0] and expect the full contents there.
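A minimal sketch of what that iteration looks like, assuming raw-JSON (dict) output items as in the payloads above (SDK response objects expose the same fields as attributes):

```python
# Sketch: collect every function call from a Responses "output" list.
# Items may be messages, reasoning items, or function calls, so filter
# on type instead of indexing output[0].
def collect_function_calls(output):
    """Return (name, arguments) pairs for every function_call item."""
    return [
        (item["name"], item["arguments"])
        for item in output
        if item.get("type") == "function_call"
    ]
```

With two parallel calls in the output, this returns two pairs you can then execute concurrently on your side.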

Correct tool placement

Descriptive non-strict function

{
  "name": "weather_conditions",
  "description": "current weather. Supports parallel call by placing in multi_tool_use",
  "strict": false,
  "parameters": {
    "type": "object",
    "required": [
      "location_city"
    ],
    "properties": {
      "location_city": {
        "type": "string"
      }
    },
    "additionalProperties": false
  }
}

Multiple needs, multiple parallel calls

The need to iterate, demonstrated

>>> response.output[0]
...                     
ResponseFunctionToolCall(arguments='{"location_city":"San Francisco"}', call_id='call_EV6RlpSglyqjLBBkdmTL0gKC', name='weather_conditions', type='function_call', id='fc_67f54a025e988192ab0d8dbd537a54de0e33792bb0a1c99e', status='completed')

>>> response.output[1]
...                     
ResponseFunctionToolCall(arguments='{"location_city":"San Jose"}', call_id='call_QeZa7OjZKHWrhN2uJxdTXi7x', name='weather_conditions', type='function_call', id='fc_67f54a029c9c81929db97c6e9d21cb330e33792bb0a1c99e', status='completed')


I have set strict: false on each of the tools, and I am not just looking at output[0]; in what I included above, you can see it’s the full output list.

I’m also using the Go SDK.

Then it comes down to model quality and instruction-following. I didn’t have to use the parallel_tool_calls parameter at all; it exists only for disabling the internal parallel tool, saving you some tokens and some error-prone usage.

If you are getting tool use, but a pattern that otherwise calls the tools iteratively, you can even mandate in the function description that the function cannot be used directly and is only to be used by parallel placement: sent to the parallel method of the multi_tool_use tool recipient, to be literal about what the AI sees internally.
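For example, a description along these lines (illustrative wording only; the exact phrasing is up to you, and results depend on model compliance):

```python
# Illustrative only: a tool description that tells the model the function
# may only be invoked via the parallel method of multi_tool_use.
description = (
    "current weather. Cannot be called directly; only to be used by "
    "parallel placement, sent to the parallel method of the "
    "multi_tool_use tool recipient."
)
```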

Here’s the JSON body of the successful simulation above.

{
  "model": "gpt-4o",
  "input": [
    {
      "role": "system",
      "content": [
        {
          "type": "input_text",
          "text": "You are weatherpal"
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "What's the temperature difference between SFO and San Jose?"
        }
      ]
    }
  ],
  "text": {
    "format": {
      "type": "text"
    }
  },
  "reasoning": {},
  "tools": [
    {
      "type": "function",
      "name": "weather_conditions",
      "description": "current weather. Supports parallel call by placing in multi_tool_use",
      "parameters": {
        "type": "object",
        "required": [
          "location_city"
        ],
        "properties": {
          "location_city": {
            "type": "string"
          }
        },
        "additionalProperties": false
      },
      "strict": false
    }
  ],
  "temperature": 1,
  "max_output_tokens": 2048,
  "top_p": 1,
  "store": false
}

Which is easy when you haven’t got tons of tokens of web results or file search results in context, and instruction-following quality isn’t being damaged by system message injection.

Interesting. When using the chat completions API with the same tool names, descriptions, and message input text, it does call the tools in parallel, so I’m curious as to why the responses API doesn’t behave the same.

Let’s look at context placement as a cause:

Responses

## multi_tool_use

// This tool serves as a wrapper for utilizing multiple tools. Each tool that can be used must be specified in the tool sections. Only tools in the functions namespace are permitted.
// Ensure that the parameters provided to each tool are valid according to that tool's specification.
namespace multi_tool_use {

// Use this function to run multiple tools simultaneously, but only if they can operate in parallel. Do this even if the prompt suggests using the tools sequentially.
type parallel = (_: {
// The tools to be executed in parallel. NOTE: only functions tools are permitted
tool_uses: {
// The name of the tool to use. The format should either be just the name of the tool, or in the format namespace.function_name for plugin and function tools.
recipient_name: string,
// The parameters to pass to the tool. Ensure these are valid according to the tool's own specifications.
parameters: object,
}[],
}) => any;

} // namespace multi_tool_use

Chat Completions

## multi_tool_use

// This tool serves as a wrapper for utilizing multiple tools. Each tool that can be used must be specified in the tool sections. Only tools in the functions namespace are permitted.
// Ensure that the parameters provided to each tool are valid according to that tool's specification.
namespace multi_tool_use {

// Use this function to run multiple tools simultaneously, but only if they can operate in parallel. Do this even if the prompt suggests using the tools sequentially.
type parallel = (_: {
// The tools to be executed in parallel. NOTE: only functions tools are permitted
tool_uses: {
// The name of the tool to use. The format should either be just the name of the tool, or in the format namespace.function_name for plugin and function tools.
recipient_name: string,
// The parameters to pass to the tool. Ensure these are valid according to the tool's own specifications.
parameters: object,
}[],
}) => any;

} // namespace multi_tool_use

Tool for sending in parallel seems the same.

And the function?

Wait: what’s this??

    {
      "name": "Tool_Nango_Hubspot_OwnerTaskList",
      "description": "Get my HubSpot tasks",
      "parameters": {...}
    },

Reminder: Responses uses a different function format than Chat Completions; definitions cannot be dropped in unchanged. There are five required keys at the top level of each function object: type, name, description, strict, and parameters.

    {
      "type": "function",
      "name": "Tool_Nango_Hubspot_OwnerTaskList",
      "description": "Retrieves users tasks. Send in Parallel with any other non-dependent Hubspot function call.",
      "strict": false,
      "parameters": {...

Sorry, when I was crafting the message I was trying to reduce the payload to make it easier to read, but I missed those two keys in the function definitions. Here’s the full payload for the request:

{
  "input": [
    {
      "content": [
        {
          "text": "Can you load up my hubspot tasks and notes?",
          "type": "input_text"
        }
      ],
      "role": "user"
    }
  ],
  "model": "gpt-4o",
  "instructions": "...",
  "parallel_tool_calls": true,
  "store": false,
  "user": "...",
  "tool_choice": "auto",
  "tools": [
    {
      "type": "web_search_preview"
    },
    {
      "name": "Tool_Nango_Hubspot_OwnerTaskList",
      "parameters": {
        "properties": {
          "action_description": {
            "description": "A brief description of the action to be performed.",
            "type": "string"
          },
          "object_type": {
            "description": "The type of object. Must be 'task' for this tool.",
            "enum": [
              "task"
            ],
            "type": "string"
          },
          "sort_by": {
            "description": "Property to sort the results by.",
            "enum": [
              "hs_lastmodifieddate",
              "hs_createdate"
            ],
            "type": "string"
          }
        },
        "required": [
          "action_description",
          "object_type",
          "sort_by"
        ],
        "type": "object"
      },
      "strict": false,
      "description": "Retrieves the 25 most recent tasks owned by the authenticated user in HubSpot. Use this tool to get tasks assigned to you.",
      "type": "function"
    },
    {
      "name": "Tool_Nango_Hubspot_OwnerNoteList",
      "parameters": {
        "properties": {
          "action_description": {
            "description": "A brief description of the action to be performed.",
            "type": "string"
          },
          "object_type": {
            "description": "The type of object. Must be 'note' for this tool.",
            "enum": [
              "note"
            ],
            "type": "string"
          },
          "sort_by": {
            "description": "Property to sort the results by.",
            "enum": [
              "hs_lastmodifieddate",
              "hs_createdate"
            ],
            "type": "string"
          }
        },
        "required": [
          "action_description",
          "object_type",
          "sort_by"
        ],
        "type": "object"
      },
      "strict": false,
      "description": "Retrieves the 25 most recent notes owned by the authenticated user in HubSpot. Use this tool to get notes created by you.",
      "type": "function"
    }
  ]
}

I just had a thought here: the web search tool may be forcing strict mode on itself and producing token-level enforcement, the same strict mode you can’t use yourself if you want parallel calls emitted.

Plus, the output of web search basically damages what follows.

You might have to make a choice: either simply turn off the parallel tool and save yourself tokens that can never be employed, or make web search a function that you are in control of.
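A sketch of the second option. The `web_search` name and its parameters here are hypothetical, not an official tool; you would implement the search yourself when the call comes back:

```python
# Hypothetical: replace the hosted web_search_preview tool with an ordinary
# non-strict function tool that you execute yourself, so it no longer sits
# outside your control in the tools list.
web_search_tool = {
    "type": "function",
    "name": "web_search",
    "description": "Search the web. Supports parallel call by placing in multi_tool_use.",
    "strict": False,
    "parameters": {
        "type": "object",
        "required": ["query"],
        "properties": {"query": {"type": "string"}},
        "additionalProperties": False,
    },
}
```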

Ah ok. I thought I had tried removing the web_search tool already and still got the same behavior, but I just tried it again, and the parallel tool calling does seem to be working now when the web search tool is not included in the list of tools. Do you think this is by design? Or a bug?
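For reference, dropping the hosted tool is just a filter over the tools list (illustrative sketch):

```python
# Illustrative: strip the hosted web search tool from the tools list for
# turns where parallel function calls matter more than web search.
def without_web_search(tools):
    return [t for t in tools if t.get("type") != "web_search_preview"]
```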

I am often lost for OpenAI’s motivations in general…

Web search is not available as a tool on Chat Completions, so that gives us an immediate cause for the difference: both the context it places and its returns increasing the distance back to the tool definitions. It also means we can’t compare against anything that does work.

If it absolutely cannot be overcome, then it would appear to be a case of one tool in your list, out of your control, carrying the equivalent of “strict”: true along with it.

First, have a tool request that can’t be answered by a web search:

Run again with web search tool also added…

Answer received.


That makes sense, thanks for your help!