Hi all,
I’m building a ChatGPT App using the new MCP Apps framework and I’m trying to support this flow:
- User pastes or uploads an image directly in ChatGPT.
- The model interprets the image (vision) and extracts structured data.
- My app’s tool is called with both:
  - the extracted structured data (JSON), and
  - a handle to the original media file (e.g., URL or file ID) so my backend can store/process it.
In Custom GPTs, this is basically what GPT file Actions provide: the model fills JSON arguments and the Action receives attached files via short‑lived URLs in a single call.
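To make the ask concrete, here is a rough sketch of the single-call input I have in mind. Every name in it (`FileHandle`, `ToolInput`, `parseToolInput`, the `kind`/`value`/`mimeType` fields) is hypothetical and only illustrates the shape; none of it is taken from the Apps SDK, the MCP spec, or GPT Actions.

```typescript
// Hypothetical input shape: all names here are illustrative assumptions,
// not an existing Apps/MCP or GPT Actions API.
interface FileHandle {
  kind: "url" | "file_id"; // short-lived download URL or an opaque file ID
  value: string;
  mimeType?: string;
}

interface ToolInput {
  extracted: Record<string, unknown>; // JSON the model filled in from the image
  file: FileHandle;                   // reference to the original upload
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isFileKind(v: unknown): v is FileHandle["kind"] {
  return v === "url" || v === "file_id";
}

// Validates an untyped payload and narrows it to ToolInput; throws on bad input.
function parseToolInput(raw: unknown): ToolInput {
  if (!isRecord(raw)) throw new Error("input must be an object");
  const { extracted, file } = raw;
  if (!isRecord(extracted)) throw new Error("extracted must be an object");
  if (!isRecord(file)) throw new Error("file must be an object");

  const { kind, value, mimeType } = file;
  if (!isFileKind(kind)) throw new Error("file.kind must be 'url' or 'file_id'");
  if (typeof value !== "string" || value.length === 0) {
    throw new Error("file.value must be a non-empty string");
  }

  const handle: FileHandle = { kind, value };
  if (mimeType !== undefined) {
    if (typeof mimeType !== "string") throw new Error("file.mimeType must be a string");
    handle.mimeType = mimeType;
  }
  return { extracted, file: handle };
}
```

The point is only that the structured arguments and the file reference arrive together in one invocation, so the backend can validate and store both without a second round trip.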
Questions:
- Is there an equivalent or planned feature for Apps/MCP tools, where an MCP tool can receive both structured arguments and user‑uploaded files from the chat context in one invocation?
- If not today, is there any documented roadmap or recommended pattern (beyond building my own upload UI in a widget) to get parity with GPT file Actions for Apps?
Thanks!