The new ChatGPT 5.5 Instant broke multi-step App/MCP tool calls

Since the new ChatGPT 5.5 Instant model was released last week, we’ve seen issues with ChatGPT Instant not reliably completing App/MCP tool flows.

Observed behavior:

  1. ChatGPT Instant either does not call any available tool, or calls only the first tool in the flow.
  2. After that single call, it stops and says it does not have access to the other tools, even though those tools are available.
  3. In our experiments, when ChatGPT Free usage falls back from Instant to ChatGPT Mini, the same tool flow starts working again.
  4. The flow also works as expected when using Thinking or Auto modes.
  5. This suggests the new Instant mode may not be invoking the follow-up tool calls required to complete a user request.

Expected behavior:
When a user asks ChatGPT to complete a task that requires tools, ChatGPT should continue calling the available tools as needed until the request is complete, rather than claiming it lacks access or stopping after the first tool call.

Has anyone else observed this with ChatGPT 5.5 Instant and multi-step App/MCP tool flows? Happy to share reproduction details if helpful.

Welcome to the forum!

Thanks for reporting this.

There does seem to be an uptick in similar MCP-related reports. One sign of that is the Related topics section at the bottom of this post.

At this point, I would not be surprised if others, and possibly even some automated tracking tools, are starting to collect references to these reports. However, I am not escalating this at this time.

Note: I am not an OpenAI employee. Others may have already escalated the issue, but I have not seen a notification confirming that.

If you share a repro, I would make it tiny and deterministic enough to separate three different failures.

Use three tools with no external dependencies: start_flow returns {id}, next_step requires that id and returns {ready:true}, and finish_flow returns the final answer. Then run the exact same prompt/server against Instant, Mini, Thinking, and Auto while logging server-side initialize, tools/list, and every tools/call.
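
As a rough illustration, the three tools could look like the sketch below. These are plain Python functions with an in-memory call log standing in for server-side logging. `CALL_LOG`, `logged`, and the `flow_id` parameter name are hypothetical and not part of any MCP SDK; in a real server each function would be registered as an MCP tool, and `initialize` and `tools/list` would have to be logged at the transport layer.

```python
import functools
import secrets
import time
from typing import Any, Callable

# Hypothetical in-memory log; a real server would write to a file or stdout.
CALL_LOG: list[dict[str, Any]] = []
_flows: set[str] = set()


def logged(fn: Callable[..., dict[str, Any]]) -> Callable[..., dict[str, Any]]:
    """Record every invocation so runs can be compared across models."""
    @functools.wraps(fn)
    def wrapper(**kwargs: Any) -> dict[str, Any]:
        CALL_LOG.append({"tool": fn.__name__, "args": kwargs, "ts": time.time()})
        return fn(**kwargs)
    return wrapper


@logged
def start_flow() -> dict[str, Any]:
    """Step 1: create a flow and return its id."""
    flow_id = secrets.token_urlsafe(8)
    _flows.add(flow_id)
    return {"id": flow_id}


@logged
def next_step(flow_id: str) -> dict[str, Any]:
    """Step 2: requires the id returned by start_flow."""
    if flow_id not in _flows:
        return {"ready": False, "error": "unknown id"}
    return {"ready": True}


@logged
def finish_flow() -> dict[str, Any]:
    """Step 3: return the final answer."""
    return {"answer": "flow complete"}
```

Comparing `[c["tool"] for c in CALL_LOG]` across Instant, Mini, Thinking, and Auto would then show exactly where each model stops.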

The key distinction is whether Instant stops because the follow-up tools are no longer registered, because the first tool result is too large, or because it chose to answer after one call. To rule out context pressure, start with a first result under ~1 KB, then repeat with a deliberately large result. If the small version chains and the large version does not, that points to payload/tool-budget pressure rather than a pure multi-step planning regression.
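
A minimal sketch of the small-versus-large comparison, assuming the tool-call names for each run have already been collected from the server log. `make_result` and `diagnose` are hypothetical helpers, and the labels they return are only shorthand for the cases above. Whether the follow-up tools were still registered has to be read from the `tools/list` log and is not covered here.

```python
import json
from typing import Any


def make_result(flow_id: str, target_bytes: int) -> dict[str, Any]:
    """Build a first-tool result whose JSON encoding is target_bytes long.

    If target_bytes is smaller than the unpadded result, no padding is added.
    """
    result: dict[str, Any] = {"id": flow_id, "padding": ""}
    base = len(json.dumps(result).encode("utf-8"))
    result["padding"] = "x" * max(0, target_bytes - base)
    return result


def diagnose(small_calls: list[str], large_calls: list[str], expected: list[str]) -> str:
    """Compare the tool-call sequences logged for the small and large runs."""
    if small_calls == expected and large_calls != expected:
        return "payload/tool-budget pressure"
    if small_calls != expected:
        return "fails even with a small payload"
    return "no failure reproduced"
```

For example, `make_result("abc", 900)` gives a payload under ~1 KB, and `make_result("abc", 200_000)` gives a deliberately large one for the second run.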

My team and I are also noticing many issues that have surfaced with this model change, including:

  • Apps no longer remain in context; they need to be re-added for each query (also reported here).
  • If the app isn’t mentioned in a follow-up query, ChatGPT defaults to a web search or its own knowledge.
  • Multi-turn conversations require users to be more deliberate, which most users aren’t aware of.

A few screenshots demonstrating the problem:

Adding the IHG app to context (correctly calls the app):

Follow-up query in the same chat without re-invoking the app (makes a web search):

Same test with the Radisson app in a new conversation (all happening within a single chat):

Some extra notes:

  • This seems to work much better with the medium/high models even when apps aren’t re-added to context.
  • In previous versions of the interface, apps would remain in context until the user explicitly removed them, which felt like much more predictable UX.

Hi Eric,

This problem seems like a major breakage in ChatGPT Apps behavior. I think it is worth escalating ASAP.

Per @tangweigangsir’s suggestion, I created a minimal reproducible example of a Python MCP server that demonstrates a description-driven dependent tool chain. It exposes three tools: first_call, second_call, and third_call. Each tool’s description tells the model which tool to call next after success, and each response returns the exact next tool arguments. The final step requires finish_success=“finish_success”.

It fails with the same “toolchain” error that our actual app fails with:

You can easily reproduce the issue: run this server.py and ask ChatGPT to use it in Instant mode (without Auto thinking):

from __future__ import annotations

import argparse
import os
import secrets
import time
from typing import Literal

from mcp.server.fastmcp import FastMCP
from mcp.server.transport_security import TransportSecuritySettings
from pydantic import Field


LOCAL_ALLOWED_HOSTS = ["127.0.0.1:*", "localhost:*", "[::1]:*"]
LOCAL_ALLOWED_ORIGINS = ["http://127.0.0.1:*", "http://localhost:*", "http://[::1]:*"]

mcp = FastMCP(
    "dependent-tool-sequence-mrp",
    instructions=(
        "This server demonstrates a description-driven three-tool sequence. "
        "Call first_call first, then second_call, then third_call."
    ),
    stateless_http=True,
    json_response=True,
)

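# In-memory run state keyed by run_id. It is lost on restart and is not
# shared across worker processes, so run the server as a single process.
_runs: dict[str, dict[str, object]] = {}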


def _csv_values(values: list[str]) -> list[str]:
    items: list[str] = []
    for value in values:
        items.extend(part.strip() for part in value.split(",") if part.strip())
    return items


def _dedupe(values: list[str]) -> list[str]:
    seen: set[str] = set()
    result: list[str] = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


def _expanded_hosts(hosts: list[str]) -> list[str]:
    expanded: list[str] = []
    for host in hosts:
        expanded.append(host)
        # Hosts given without a port also get a wildcard-port variant.
        if ":" not in host:
            expanded.append(f"{host}:*")
    return _dedupe(expanded)


def _origins_for_hosts(hosts: list[str]) -> list[str]:
    origins: list[str] = []
    for host in hosts:
        if host in {"0.0.0.0", "::"}:
            continue
        if host.startswith(("http://", "https://")):
            origins.append(host)
            continue
        origins.extend([f"http://{host}", f"https://{host}"])
    return _dedupe(origins)


def _new_token(prefix: str) -> str:
    return f"{prefix}_{secrets.token_urlsafe(8)}"


@mcp.tool(
    name="first_call",
    description=(
        "Step 1 of 3. Call this tool first. If this tool returns status='success', "
        "the next action is to call the MCP tool named second_call with the run_id "
        "and first_call_token returned by this tool."
    ),
)
def first_call() -> dict[str, object]:
    """Start the dependent tool-call sequence."""
    run_id = _new_token("run")
    first_call_token = _new_token("first")
    _runs[run_id] = {
        "created_at": time.time(),
        "first_call_token": first_call_token,
        "second_call_token": None,
        "complete": False,
    }

    return {
        "status": "success",
        "run_id": run_id,
        "first_call_token": first_call_token,
        "next_tool": "second_call",
        "next_arguments": {
            "run_id": run_id,
            "first_call_token": first_call_token,
        },
    }


@mcp.tool(
    name="second_call",
    description=(
        "Step 2 of 3. Call this tool only after first_call returns status='success'. "
        "Use the exact run_id and first_call_token returned by first_call. If this "
        "tool returns status='success', the next action is to call the MCP tool named "
        "third_call with the run_id, second_call_token, and finish_success='finish_success'."
    ),
)
def second_call(
    run_id: str = Field(description="The run_id returned by first_call."),
    first_call_token: str = Field(description="The first_call_token returned by first_call."),
) -> dict[str, object]:
    """Continue the sequence after first_call."""
    run = _runs.get(run_id)
    if run is None:
        return {
            "status": "error",
            "message": "Unknown run_id. Call first_call before second_call.",
        }

    if run["first_call_token"] != first_call_token:
        return {
            "status": "error",
            "message": "Invalid first_call_token. Use the exact token returned by first_call.",
        }

    second_call_token = _new_token("second")
    run["second_call_token"] = second_call_token

    return {
        "status": "success",
        "run_id": run_id,
        "second_call_token": second_call_token,
        "next_tool": "third_call",
        "next_arguments": {
            "run_id": run_id,
            "second_call_token": second_call_token,
            "finish_success": "finish_success",
        },
    }


@mcp.tool(
    name="third_call",
    description=(
        "Step 3 of 3. Call this tool only after second_call returns status='success'. "
        "Use the exact run_id and second_call_token returned by second_call, and set "
        "finish_success exactly to 'finish_success'. If this tool returns status='success', "
        "finish the user interaction with a successful final response."
    ),
)
def third_call(
    run_id: str = Field(description="The run_id originally returned by first_call."),
    second_call_token: str = Field(description="The second_call_token returned by second_call."),
    finish_success: Literal["finish_success"] = Field(
        description="Must be exactly the string 'finish_success'."
    ),
) -> dict[str, object]:
    """Finish the sequence after second_call."""
    run = _runs.get(run_id)
    if run is None:
        return {
            "status": "error",
            "message": "Unknown run_id. Call first_call before third_call.",
        }

    if run["second_call_token"] != second_call_token:
        return {
            "status": "error",
            "message": "Invalid second_call_token. Use the exact token returned by second_call.",
        }

    run["complete"] = True
    return {
        "status": "success",
        "run_id": run_id,
        "finish_success": finish_success,
        "message": "Dependent MCP tool sequence completed successfully.",
    }


def main() -> None:
    parser = argparse.ArgumentParser(description="Minimal dependent-tool MCP server.")
    parser.add_argument(
        "--transport",
        choices=["http", "stdio"],
        default="http",
        help="Run as Streamable HTTP for EC2 or stdio for local MCP clients.",
    )
    parser.add_argument("--host", default="0.0.0.0", help="HTTP host bind address.")
    parser.add_argument("--port", type=int, default=8000, help="HTTP port.")
    parser.add_argument(
        "--allowed-host",
        action="append",
        default=[],
        help=(
            "Allowed HTTP Host header for Streamable HTTP. Repeat or comma-separate. "
            "Example: --allowed-host aleena-unilobed-karma.ngrok-free.dev"
        ),
    )
    parser.add_argument(
        "--allowed-origin",
        action="append",
        default=[],
        help=(
            "Allowed Origin header. Repeat or comma-separate. If omitted, origins are "
            "derived from allowed hosts."
        ),
    )
    parser.add_argument(
        "--disable-dns-rebinding-protection",
        action="store_true",
        help="Disable Host/Origin validation. Useful only for local experiments.",
    )
    args = parser.parse_args()

    if args.transport == "stdio":
        mcp.run(transport="stdio")
        return

    env_allowed_hosts = os.getenv("MCP_ALLOWED_HOSTS", "")
    env_allowed_origins = os.getenv("MCP_ALLOWED_ORIGINS", "")
    configured_hosts = _csv_values(args.allowed_host + [env_allowed_hosts])
    configured_origins = _csv_values(args.allowed_origin + [env_allowed_origins])

    allowed_hosts = _dedupe(LOCAL_ALLOWED_HOSTS + _expanded_hosts(configured_hosts))
    allowed_origins = _dedupe(
        LOCAL_ALLOWED_ORIGINS + configured_origins + _origins_for_hosts(configured_hosts)
    )
    mcp.settings.host = args.host
    mcp.settings.port = args.port
    mcp.settings.transport_security = TransportSecuritySettings(
        enable_dns_rebinding_protection=not args.disable_dns_rebinding_protection,
        allowed_hosts=allowed_hosts,
        allowed_origins=allowed_origins,
    )

    import uvicorn

    uvicorn.run(mcp.streamable_http_app(), host=args.host, port=args.port)


if __name__ == "__main__":
    main()
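
For comparison, here is a rough sketch of what a client that completes the chain would do with the `next_tool` and `next_arguments` hints in each response. `follow_chain` and the stub tools are hypothetical stand-ins written for illustration only; they do not talk to the server above over MCP, and they say nothing about how ChatGPT actually decides which tool to call.

```python
from typing import Any, Callable, Optional

Tool = Callable[..., dict[str, Any]]


def follow_chain(tools: dict[str, Tool], first_tool: str, max_steps: int = 10) -> list[str]:
    """Call first_tool, then keep following next_tool/next_arguments hints."""
    called: list[str] = []
    name: Optional[str] = first_tool
    args: dict[str, Any] = {}
    while name is not None and len(called) < max_steps:
        result = tools[name](**args)
        called.append(name)
        if result.get("status") != "success":
            break  # Stop on the first error response.
        name = result.get("next_tool")  # None once the chain is finished.
        args = result.get("next_arguments", {})
    return called


# Stub tools that mimic the response shape of the server above.
def first_call() -> dict[str, Any]:
    return {"status": "success", "next_tool": "second_call",
            "next_arguments": {"run_id": "r1", "first_call_token": "t1"}}


def second_call(run_id: str, first_call_token: str) -> dict[str, Any]:
    return {"status": "success", "next_tool": "third_call",
            "next_arguments": {"run_id": run_id, "second_call_token": "t2",
                               "finish_success": "finish_success"}}


def third_call(run_id: str, second_call_token: str, finish_success: str) -> dict[str, Any]:
    return {"status": "success", "run_id": run_id, "finish_success": finish_success}


STUB_TOOLS: dict[str, Tool] = {
    "first_call": first_call,
    "second_call": second_call,
    "third_call": third_call,
}
```

With these stubs, `follow_chain(STUB_TOOLS, "first_call")` visits all three tools in order. In the failing case, the server log would show the sequence ending after `first_call`.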

I hear you, but I also have a responsibility to be judicious and accurate about what I escalate.

One useful step would be to open a support request with OpenAI and then report the ticket number here. While each affected user can report the issue, I would suggest that each user open only one support request for the same problem. Multiple duplicate tickets from the same person may create more friction than benefit.

There are two support-contact methods worth noting:

  • In my own experience, emailing support@openai.com has worked and seems simpler.
  • The officially documented route is through the chat bubble at the bottom-right of help.openai.com.

Please also review the section titled Best Practices for Submitting a Support Request. Include the affected account email, sign-in method, subscription plan if relevant, clear reproduction steps, screenshots if useful, timestamps with time zone, and any relevant error messages.

The main limitation with reporting only on this forum is that a Discourse account does not necessarily map directly to a ChatGPT account. Support will usually need account-specific information that should not be posted publicly.

HTH

Opened an email support case: Case Number 10707076.

On our end, this has caused a major drop in user activation and engagement since the new Instant model was published.

I’m not sure how else I can help evaluate the severity of this issue beyond providing a fully working reproduction of the regression. Happy to share any additional details that would make escalation easier.

Also noticed behavior differences between paid and free accounts.

On the paid accounts, making multiple requests in one query triggers a tool call for each request, as expected:

However, on the free account, it makes the first call and then fills the gap with the model’s own incorrect knowledge instead of making subsequent calls: