[Bug/Regression] MCP tool result wrapped as `multimodal_text` + `parts[]` — model cannot extract structured data, chained tool call fails with fabricated ID

Context

I’m integrating a third-party MCP server with ChatGPT. The server exposes three tools:

  • search_businessesId_by_name — searches a business by name, returns a short JSON list with IDs
  • search_businesses — searches by activity/category, returns a structured JSON list
  • get_business_details_and_phone — fetches full details for a given business_id

The expected flow for a named-entity query (e.g. “give me all info on Domaine de Locguénolé in Kervignac”) is:

  1. search_businessesId_by_name → returns { id: "00257381", name: "Domaine de Locguénolé", ... }
  2. get_business_details_and_phone(business_id: "00257381") → returns full details
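
A minimal sketch of this two-step flow, written from the client side. The `call_tool` transport and the `fake_call_tool` stand-in are hypothetical; only the tool names, the `who` / `location` / `business_id` parameters, and the `results[].id` field come from the payloads in this report. The shape of the details response is a placeholder.

```python
import json
from typing import Any, Callable

# Hypothetical transport: takes a tool name and arguments, returns the tool's JSON text.
ToolCaller = Callable[[str, dict], str]


def lookup_business(call_tool: ToolCaller, who: str, location: str) -> Any:
    """Resolve the business ID by name, then fetch the full details."""
    search = json.loads(
        call_tool("search_businessesId_by_name", {"who": who, "location": location})
    )
    results = search.get("results", [])
    if not results:
        raise LookupError(f"no business found for {who!r} in {location!r}")
    # Pass the ID through unchanged: it is a string and may have leading zeros.
    business_id = results[0]["id"]
    return json.loads(
        call_tool("get_business_details_and_phone", {"business_id": business_id})
    )


def fake_call_tool(name: str, arguments: dict) -> str:
    """Stand-in for the real MCP server, for demonstration only."""
    if name == "search_businessesId_by_name":
        return json.dumps({"results": [{"id": "00257381", "name": "Domaine de Locguénolé"}]})
    if name == "get_business_details_and_phone":
        # Placeholder response: echoes the ID it was called with.
        return json.dumps({"id": arguments["business_id"]})
    raise ValueError(f"unknown tool: {name}")


details = lookup_business(fake_call_tool, "Domaine de Locguénolé", "Kervignac")
print(details["id"])  # prints 00257381
```

The point of the sketch is that step 2 depends entirely on reading `id` out of step 1 as a structured field. If that field is not readable, there is nothing valid to pass as `business_id`.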

Observed behavior

Step 1 — search_businessesId_by_name is called correctly, but its result is wrapped in a non-standard format in the debug panel:

{
  "content_type": "multimodal_text",
  "parts": [
    "Citation Marker: @filecite@turn0file0@",
    "[L1] { [L2] ... [L3] \"name\": \"Domaine de Locguénolé\" ... }"
  ]
}

The actual business id ("00257381") is buried inside the fragmented parts[1] string and is not accessible as a structured field.
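
For anyone inspecting payloads copied out of the debug panel, here is a small diagnostic sketch that detects the wrapper and tries to recover the original JSON. It assumes the fragments are plain JSON text with `[L<n>]` markers spliced in, which is what the excerpt above suggests; the real format is truncated in my capture, so treat the regex as a guess. `is_wrapped` and `recover_payload` are names I made up.

```python
import json
import re
from typing import Any, Optional

# Assumed marker shape, based on the "[L1] { [L2] ..." excerpt.
LINE_MARKER = re.compile(r"\[L\d+\]\s*")


def is_wrapped(result: Any) -> bool:
    """True if the result has the multimodal_text + parts[] shape."""
    return (
        isinstance(result, dict)
        and result.get("content_type") == "multimodal_text"
        and isinstance(result.get("parts"), list)
    )


def recover_payload(result: dict) -> Optional[Any]:
    """Best effort: strip line markers from each part and return the first
    part that parses as a JSON object or array, or None if none does."""
    for part in result.get("parts", []):
        if not isinstance(part, str):
            continue
        candidate = LINE_MARKER.sub("", part).strip()
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue  # e.g. the citation marker line
        if isinstance(parsed, (dict, list)):
            return parsed
    return None


sample = {
    "content_type": "multimodal_text",
    "parts": [
        "Citation Marker: @filecite@turn0file0@",
        '[L1] { [L2] "id": "00257381", [L3] "name": "Domaine de Locguénolé" [L4] }',
    ],
}
print(is_wrapped(sample), recover_payload(sample)["id"])  # prints True 00257381
```

This only helps with debugging. The model sees the wrapped form directly, so a client-side helper like this cannot work around the bug. Note also that the regex would corrupt any legitimate string value that happens to contain text like `[L3]`.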

Step 2 — get_business_details_and_phone is called, but with a fabricated ID:

// Tool input
{ "business_id": "0" }

// Tool output
{ "id": "", "is_error": true }
// Widget: state: null, responseMetadata: null, externalCallTimeMs: null

The model then reports to the user that “details are not available”, even though the business exists and the MCP server is fully functional.


Reference conversations

Working conversation (no parsing issue, correct chaining):
6a047239-bcd8-8329-9428-4de80f2b6120

Broken conversations (both exhibit the bug described above):
6a217530-c564-8330-a376-e96a2475e3e2
6a217136-ff08-8329-ac93-7ae94c22a12a

Same MCP server, same query, so the working and broken conversations can be compared directly.


Key evidence: same MCP server works correctly on Claude

I tested the exact same query (search_businessesId_by_name, who: "Domaine de Locguénolé", location: "Kervignac") via Claude, which also connects to this MCP server.

Claude receives the tool result as plain JSON — no multimodal_text wrapping, no parts[], no citation markers:

{
  "total_count": 1,
  "results_returned": 1,
  "results": [{
    "id": "00257381",
    "name": "Domaine de Locguénolé",
    "address": "Route de Port Louis Le Hingair 56700 Kervignac",
    "main_activity": "restaurants",
    "permanently_closed": false
  }]
}

Claude correctly extracts id: "00257381" and chains immediately to get_business_details_and_phone, returning the full business profile.

The MCP server applies no differentiation between clients — the response format is identical on the wire. The multimodal_text wrapping is introduced by ChatGPT.


Root cause hypothesis

ChatGPT appears to route some MCP tool results through its internal citation/grounding pipeline (typically used for web search or file retrieval). This pipeline:

  1. Wraps the response in { content_type: "multimodal_text", parts: [...] }
  2. Fragments the JSON content into line-indexed strings [L1], [L2]
  3. Injects a citation marker @filecite@turnXfileY@

This makes the structured data opaque to the model. It cannot extract named fields (like id), substitutes a default value ("0"), and proceeds — resulting in a server-side error.
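
To make the hypothesis concrete, here is a rough simulation of the three steps. This is not ChatGPT's actual code, which I have no visibility into. It is only a sketch that produces output shaped like what the debug panel shows, and the function name and its `turn` / `file_index` parameters are invented.

```python
import json
from typing import Any


def wrap_like_citation_pipeline(payload: Any, turn: int = 0, file_index: int = 0) -> dict:
    """Simulate the hypothesized transformation of a JSON tool result."""
    # Step 2: split the pretty-printed JSON into lines and prefix each with [Ln].
    lines = json.dumps(payload, indent=2, ensure_ascii=False).splitlines()
    indexed = " ".join(f"[L{n}] {line.strip()}" for n, line in enumerate(lines, start=1))
    # Steps 1 and 3: wrap in multimodal_text and prepend a citation marker.
    return {
        "content_type": "multimodal_text",
        "parts": [f"Citation Marker: @filecite@turn{turn}file{file_index}@", indexed],
    }


wrapped = wrap_like_citation_pipeline({"id": "00257381"})
print(wrapped["parts"][1])  # prints [L1] { [L2] "id": "00257381" [L3] }
print("id" in wrapped)      # prints False
```

After the transformation the top-level object has no `id` key at all. The value survives only as a substring of `parts[1]`, interleaved with line markers.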

Note: search_businesses (different tool, same server) does not trigger this wrapping and works correctly. The difference may lie in how each tool’s response schema is interpreted by the pipeline.


Side note: this behavior started appearing after the fix was deployed for the issue tracked in “REGRESSION: Free ChatGPT accounts unable to invoke apps!” (post #29 by reagent-brian). I’m not sure it’s related, but it’s worth mentioning in case the fix introduced a side effect on tool result handling.

Thanks for reporting. Bumping for visibility.

Additional observation that might help narrow down the root cause:

search_businessesId_by_name and search_businesses are very similar tools on the same MCP server, and their response payloads are structurally close. The most notable difference is that search_businesses comes with a UI widget (rendered correctly in the debug panel), while search_businessesId_by_name has no UI component at all.

My hypothesis is that the absence of a UI widget on search_businessesId_by_name may be what triggers the multimodal_text + parts[] wrapping — as if ChatGPT’s pipeline falls back to its citation/grounding mode when it doesn’t find an expected UI definition in the tool response, instead of treating the result as plain structured JSON.

If that’s the case, the fix might be on the pipeline side: tool results with no UI should be passed through as-is, not routed through the citation pipeline.

As mentioned earlier, this bug started appearing right after the fix was deployed for “REGRESSION: Free ChatGPT accounts unable to invoke apps!” (post #29 by reagent-brian).

Bumping this with additional evidence, as this bug may be harder to reproduce in a generic setup — it only affects MCP tools that have no UI widget associated with their response.

I ran the exact same query on three different clients connected to the same MCP server:


1. ChatGPT: broken ❌


2. Claude: working ✅

3. MCP Inspector: working ✅

{
  "total_count": 2,
  "results_returned": 2,
  "results": [
    {
      "id": "00673379",
      "name": "Leroy Merlin Rennes Sud - Chantepie",
      "address": "Parc D' Activites Rocade Sud 16 all Guerlédan 35135 Chantepie",
      "main_activity": "bricolage, outillage",
      "permanently_closed": false
    },
    {
      "id": "09436522",
      "name": "Leroy Merlin",
      "address": "Rocade Nord Rd 29 Zac Pluvignon 35830 Betton",
      "main_activity": "bricolage, outillage",
      "permanently_closed": false
    }
  ]
}

All three clients hit the same MCP server, same tool, same query. The server returns identical JSON in all cases — the multimodal_text wrapping is introduced exclusively by ChatGPT.
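
One way to check the "identical JSON" claim is to fingerprint the raw payload captured from each client. This is a sketch, assuming you can copy the tool result text out of each client's debug view; `fingerprint` is a made-up helper name.

```python
import hashlib
import json


def fingerprint(payload_text: str) -> str:
    """Hash a JSON payload in canonical form, so that whitespace and
    key order differences between captures do not matter."""
    canonical = json.dumps(
        json.loads(payload_text),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


from_claude = '{"total_count": 1, "results_returned": 1}'
from_inspector = '{\n  "results_returned": 1,\n  "total_count": 1\n}'
print(fingerprint(from_claude) == fingerprint(from_inspector))  # prints True
```

A capture that has been through the wrapping will produce a different fingerprint, because its top-level keys are `content_type` and `parts` rather than the server's own fields.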

The key differentiator between the tools on this server: search_businesses has a UI widget and works correctly in ChatGPT. search_businessesId_by_name has no UI widget and triggers the broken behavior. This strongly suggests ChatGPT’s pipeline falls back to citation/grounding mode when no UI definition is found in the tool response, instead of passing the JSON through as-is.


On a side note — between the previous bug and this one, we’ve been struggling to properly showcase our app for close to a month now. We’re the team behind the pagesjaunes MCP integration, which is the main business directory for the French market, so this has been a bit of a tough stretch for us. No pressure at all, but if anyone could share a quick status on whether this is being looked into, it would genuinely help us plan on our end.

Hi! We recently faced the same problem with our MCP server (GitHub: ignfab/geocontext).

Yesterday I asked a friend (Claude) to write some tests to verify that the server response complies with the MCP protocol and that the problem lies in ChatGPT’s integration layer (ignfab/debug-geocontext-chatgpt).
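
For readers who want to run a similar check on their own server, here is a minimal sketch of what such a test can look like. It is not the code from that repository. It checks the basic shape of an MCP `tools/call` result (a `content` list of typed items, optional boolean `isError`), and it additionally assumes that text items carry JSON, which is a convention of servers like ours and not an MCP requirement. `check_tool_result` is a made-up name.

```python
import json
from typing import Any


def check_tool_result(result: dict) -> list:
    """Return a list of problems found in a tools/call result (empty if none)."""
    content = result.get("content")
    if not isinstance(content, list):
        return ["'content' must be a list of content items"]
    problems = []
    for i, item in enumerate(content):
        if not isinstance(item, dict) or "type" not in item:
            problems.append(f"content[{i}] must be an object with a 'type'")
            continue
        if item["type"] != "text":
            continue  # image, audio, resource items are not checked here
        text = item.get("text")
        if not isinstance(text, str):
            problems.append(f"content[{i}].text must be a string")
            continue
        try:
            json.loads(text)  # server convention (assumption), not required by MCP
        except json.JSONDecodeError:
            problems.append(f"content[{i}].text is not valid JSON")
    if "isError" in result and not isinstance(result["isError"], bool):
        problems.append("'isError' must be a boolean")
    return problems


good = {"content": [{"type": "text", "text": '{"results": []}'}], "isError": False}
print(check_tool_result(good))  # prints []
```

Running the same check against a captured `multimodal_text` object fails at the first step, since it has no `content` list. That is consistent with the wrapping being added after the server has responded.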

The good news is that it seems to be solved today (2026-06-12)! “What is the exact position of the Eiffel Tower according to the MCP geocontext?” is now answered correctly (no more JSON conversion into multimodal_text with Citation Marker - display_url / display_title).

That’s fantastic news, thank you so much for sharing! I can confirm on our end as well — I’ve just tested and the issue is fully resolved. No more multimodal_text wrapping, tool chaining is working correctly again.

Really glad to hear it’s fixed across multiple MCP integrations, not just ours. And kudos for the methodical approach — having Claude write the compliance tests to isolate the problem at the ChatGPT integration layer is exactly the kind of rigorous debugging that helps move things forward.

Hopefully the fix sticks this time! 🤝