Realtime API error: "MCP tool not found in cache, the session must have completed fetching the tools from the MCP server before a cached version can be used."

Hi team,

We’re using the Realtime API and encountering a problem when manually including MCP-related items in the input field of a response.create call.

Here’s the payload we’re sending:

{
  "type": "response.create",
  "response": {
    "output_modalities": ["audio"],
    "input": [
      {
        "type": "item_reference",
        "id": "item_CVEL8WHhlFHlvfQZ5ILXD"
      },
      {
        "type": "message",
        "role": "system",
        "content": [
          {
            "type": "input_text",
            "text": "[omitted for brevity]"
          }
        ]
      },
      {
        "server_label": "[hidden]",
        "type": "mcp_list_tools",
        "id": "item_CVEL1a5u3XNftgdSiKAkF"
      }
    ]
  }
}
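
For anyone trying to reproduce this, the payload above can be generated with a small helper. This is only a sketch: the function name `build_response_create` and its parameters are hypothetical, and the IDs and server label are placeholders standing in for the real values.

```python
import json


def build_response_create(reference_id, system_text, mcp_item_id, server_label):
    """Build a response.create event with the same custom input array as above.

    Hypothetical helper for reproduction only; it mirrors the payload in this
    post and does not talk to the API.
    """
    return {
        "type": "response.create",
        "response": {
            "output_modalities": ["audio"],
            "input": [
                # Reference to an item already in the conversation.
                {"type": "item_reference", "id": reference_id},
                # Extra system message supplied for this response only.
                {
                    "type": "message",
                    "role": "system",
                    "content": [{"type": "input_text", "text": system_text}],
                },
                # The MCP tool listing item we include manually.
                {
                    "server_label": server_label,
                    "type": "mcp_list_tools",
                    "id": mcp_item_id,
                },
            ],
        },
    }


payload = build_response_create(
    reference_id="item_CVEL8WHhlFHlvfQZ5ILXD",
    system_text="[omitted for brevity]",
    mcp_item_id="item_CVEL1a5u3XNftgdSiKAkF",
    server_label="[hidden]",
)
print(json.dumps(payload, indent=2))
```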

However, when we trigger an MCP call afterward, we consistently get this error:

MCP tool not found in cache, the session must have completed fetching the tools from the MCP server before a cached version can be used.

Observed behavior

  1. This error can occur multiple times within the same conversation, not just on the first MCP invocation.
  2. When it happens, there’s a noticeable delay before the MCP call finally executes.
  3. Even after a call succeeds, later responses in the same session can still fail with the same error.
  4. When we don’t include a custom input array, everything works smoothly and tools are accessed instantly.

Expected behavior

  • Once MCP tools are fetched and cached during session initialization, they should remain available and stable throughout the session.

  • Customizing response.create.input should not break or reset the MCP tool cache.

Questions

Is this behavior expected?
Is there a recommended way to safely include MCP-related items in the input array without breaking the MCP cache or causing delayed tool calls?
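
A possible client-side workaround, sketched below and not confirmed to avoid the error, is to hold back response.create until the session reports that the MCP tool listing has finished. The class `McpResponseGate` and the `send` callable are hypothetical, and the event names `mcp_list_tools.completed` and `mcp_list_tools.failed` are assumptions that should be checked against the Realtime API server event reference.

```python
from collections import deque


class McpResponseGate:
    """Queue response.create events until MCP tool listing has completed.

    Hypothetical helper: `send` is any callable that transmits one event dict
    (for example, a wrapper around the WebSocket send call).
    """

    # Assumed server event names; verify against the Realtime API reference.
    COMPLETED = "mcp_list_tools.completed"
    FAILED = "mcp_list_tools.failed"

    def __init__(self, send):
        self._send = send
        self._ready = False
        self._pending = deque()

    def handle_server_event(self, event):
        """Feed every server event here so the gate can track readiness."""
        event_type = event.get("type")
        if event_type == self.COMPLETED:
            self._ready = True
            self._flush()
        elif event_type == self.FAILED:
            # Keep queuing; the caller decides whether to retry or give up.
            self._ready = False

    def request_response(self, payload):
        """Send immediately if tools are listed, otherwise queue the payload."""
        if self._ready:
            self._send(payload)
        else:
            self._pending.append(payload)

    def _flush(self):
        # Send queued payloads in the order they were requested.
        while self._pending:
            self._send(self._pending.popleft())


# Usage: the list stands in for a real connection.
sent = []
gate = McpResponseGate(sent.append)
gate.request_response({"type": "response.create"})  # queued, not sent yet
gate.handle_server_event({"type": "mcp_list_tools.completed"})  # flushes queue
```

This only delays the request on our side; it does not explain why a custom input array would affect the cache, which is the part we would like clarified.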