Realtime API error: "MCP tool not found in cache, the session must have completed fetching the tools from the MCP server before a cached version can be used."

Hi team,

We’re using the Realtime API and encountering a problem when manually including MCP-related items in the input field of a response.create call.

Here’s the payload we’re sending:

{
  "type": "response.create",
  "response": {
    "output_modalities": ["audio"],
    "input": [
      {
        "type": "item_reference",
        "id": "item_CVEL8WHhlFHlvfQZ5ILXD"
      },
      {
        "type": "message",
        "role": "system",
        "content": [
          {
            "type": "input_text",
            "text": "[omitted for brevity]"
          }
        ]
      },
      {
        "server_label": "[hidden]",
        "type": "mcp_list_tools",
        "id": "item_CVEL1a5u3XNftgdSiKAkF"
      }
    ]
  }
}

However, when we trigger an MCP call afterward, we consistently get this error:

MCP tool not found in cache, the session must have completed fetching the tools from the MCP server before a cached version can be used.

Observed behavior

  1. This error can occur multiple times within the same conversation, not just on the first MCP invocation.
  2. When it happens, there’s a noticeable delay before the MCP call finally executes.
  3. Once the call succeeds, subsequent responses may still occasionally fail again later in the session.
  4. When we don’t include a custom input array, everything works smoothly and tools are accessed instantly.

Expected behavior

  • Once MCP tools are fetched and cached during session initialization, they should remain available and stable throughout the session.

  • Customizing response.create.input should not break or reset the MCP tool cache.

Question

Is this behavior expected?
Is there a recommended way to safely include MCP-related items in the input array without breaking the MCP cache or causing delayed tool calls?

Hey zhoulx,

1) response.create.response.input creates a brand-new context

When you pass response.input in a response.create call, you are not using the default conversation context anymore. The Realtime docs are explicit about this: providing input creates a new context for that response, and an empty array ([]) will even clear the context entirely.

That means anything the session previously cached — including the MCP tool list — may no longer be present in the active context for that response. So even though the tools were fetched earlier in the session, they’re effectively “gone” from this response’s view.

Docs:

https://platform.openai.com/docs/api-reference/realtime-beta-client-events/response/create


2) The injected mcp_list_tools item is incomplete

In your payload you’re manually adding something like:

{
  "type": "mcp_list_tools",
  "server_label": "...",
  "id": "..."
}

But per the spec, an actual mcp_list_tools item must include a full tools array, with each tool’s metadata (name, input_schema, etc.). When you inject a stub without that data, the system can’t treat it as a valid cached MCP tool list — it’s basically just an item-shaped placeholder, not a real tools cache.
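
To make the difference between a stub and a full item concrete, here is a minimal Python sketch of a client-side check. The helper name is_complete_mcp_list_tools is hypothetical, and the required fields (a non-empty tools array whose entries carry name and input_schema) are an assumption based on the description above, not a confirmed schema:

```python
from typing import Any


def is_complete_mcp_list_tools(item: dict[str, Any]) -> bool:
    """Return True if `item` looks like a fully populated mcp_list_tools item.

    Hypothetical helper: the required fields are assumed from the
    description in this thread, not taken from an official schema.
    """
    if item.get("type") != "mcp_list_tools":
        return False
    tools = item.get("tools")
    # A stub with no tools array (or an empty one) gives the model nothing to call.
    if not isinstance(tools, list) or not tools:
        return False
    # Each tool entry is expected to carry at least a name and an input schema.
    return all(
        isinstance(tool, dict) and "name" in tool and "input_schema" in tool
        for tool in tools
    )


# The shape injected in the question: label and id only, no tools.
stub = {"type": "mcp_list_tools", "server_label": "example", "id": "item_123"}
# The same item with an illustrative tools array attached.
full = {**stub, "tools": [{"name": "search", "input_schema": {"type": "object"}}]}

print(is_complete_mcp_list_tools(stub))
print(is_complete_mcp_list_tools(full))
```

Running a check like this before sending response.create makes it obvious when an item is only a placeholder.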

Docs (item types & MCP tools):

https://platform.openai.com/docs/api-reference/realtime/events


3) You’re likely racing the MCP cache

The error message you’re seeing:

“MCP tool not found in cache, the session must have completed fetching the tools…”

is what you would expect to see if MCP tools are used before the server has finished producing the real mcp_list_tools item for that MCP server. If your response is created (or your context overridden) before that item arrives, the cache isn’t ready yet.


How to fix it

Fix option A (recommended): Don’t override response.input

If you don’t actually need a brand-new context, just omit response.input entirely and let the response use the default conversation. This preserves the MCP tool cache and avoids the problem altogether.
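
As a rough Python sketch of option A, the builder below only adds the input key when the caller explicitly asks for a custom context. The function name build_response_create is hypothetical; the payload fields mirror the ones in the question:

```python
import json
from typing import Any, Optional


def build_response_create(
    input_items: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Build a response.create event, leaving out `input` unless it is given."""
    response: dict[str, Any] = {"output_modalities": ["audio"]}
    # Omitting the key keeps the default conversation context.
    # Note that an empty list is NOT the same as omitting it: per the
    # explanation above, [] clears the context for that response.
    if input_items is not None:
        response["input"] = input_items
    return {"type": "response.create", "response": response}


default_event = build_response_create()
print(json.dumps(default_event, indent=2))
```

Keeping None and [] distinct in the builder avoids accidentally wiping the context with an empty array.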

Docs:

https://platform.openai.com/docs/api-reference/realtime-beta-client-events/response/create


Fix option B: Reference the real MCP tool list item

If you do need to customize response.input, don’t inject a synthetic mcp_list_tools object. Instead, wait until the server sends the real mcp_list_tools item (with the full tools array), then include it using an item_reference.

In other words, your response.input should reference the actual item ID the server created — not a manually constructed { "type": "mcp_list_tools" } stub.
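
A minimal Python sketch of option B, reusing the item IDs from the question. The helper names are hypothetical, and it assumes the second ID really is the server-created mcp_list_tools item:

```python
from typing import Any


def item_reference(item_id: str) -> dict[str, Any]:
    """Point at an item that already exists in the conversation."""
    return {"type": "item_reference", "id": item_id}


def build_custom_response_create(
    context_item_ids: list[str],
    system_text: str,
    mcp_list_tools_item_id: str,
) -> dict[str, Any]:
    """Build a response.create event with a custom input array."""
    input_items: list[dict[str, Any]] = [
        item_reference(item_id) for item_id in context_item_ids
    ]
    input_items.append(
        {
            "type": "message",
            "role": "system",
            "content": [{"type": "input_text", "text": system_text}],
        }
    )
    # Reference the server-created tool list instead of re-declaring it inline.
    input_items.append(item_reference(mcp_list_tools_item_id))
    return {
        "type": "response.create",
        "response": {"output_modalities": ["audio"], "input": input_items},
    }


event = build_custom_response_create(
    ["item_CVEL8WHhlFHlvfQZ5ILXD"],
    "[omitted for brevity]",
    "item_CVEL1a5u3XNftgdSiKAkF",  # assumed to be the real mcp_list_tools item
)
```

The only change from the original payload is the last entry: an item_reference in place of the inline mcp_list_tools object.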

Docs (raw items vs references):

https://platform.openai.com/docs/api-reference/realtime-beta-client-events/response/create


Fix option C: Wait until tools are fully fetched

On session startup, wait until the server has produced the mcp_list_tools item for that MCP server (for example, by watching for the mcp_list_tools.completed server event, or whatever “tools loaded” signal you’re using) before triggering any MCP tool calls. Until that signal arrives, the cache is not guaranteed to exist.
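
One way to sketch option C in Python is a small gate that blocks until the tool list for a given server_label has been seen. The class McpToolsGate is hypothetical, and the event shape it inspects (an item key holding an mcp_list_tools item with a tools array) is an assumption, so adapt the check to the events your session actually receives:

```python
import asyncio
from typing import Any


class McpToolsGate:
    """Tracks which MCP servers have finished listing their tools."""

    def __init__(self) -> None:
        self._ready: dict[str, asyncio.Event] = {}

    def _event_for(self, server_label: str) -> asyncio.Event:
        return self._ready.setdefault(server_label, asyncio.Event())

    def handle_server_event(self, event: dict[str, Any]) -> None:
        # Assumed shape: the server event carries the finished item under "item".
        item = event.get("item") or {}
        if item.get("type") == "mcp_list_tools" and isinstance(item.get("tools"), list):
            self._event_for(item.get("server_label", "")).set()

    async def wait_until_ready(self, server_label: str, timeout: float = 10.0) -> None:
        # Raises asyncio.TimeoutError if the tool list never shows up.
        await asyncio.wait_for(self._event_for(server_label).wait(), timeout)


async def main() -> bool:
    gate = McpToolsGate()
    # Simulated server event; a real client would feed every incoming event in.
    gate.handle_server_event(
        {
            "type": "conversation.item.done",
            "item": {
                "type": "mcp_list_tools",
                "server_label": "example",
                "tools": [{"name": "search", "input_schema": {"type": "object"}}],
            },
        }
    )
    await gate.wait_until_ready("example", timeout=1.0)
    return True  # safe to send response.create from here on


ready = asyncio.run(main())
```

In a real client, call handle_server_event from the WebSocket receive loop and await wait_until_ready before the first response.create that may trigger an MCP call.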

Docs (MCP + Realtime flow):

https://platform.openai.com/docs/guides/realtime
