Handling Timeouts with Long-Running MCP Connectors (Vertex AI Agent)

Hi everyone,

I’ve built an MCP connector that connects to a Vertex AI Agent for generating personalized sales outreach messages. The problem is that my agent takes more than 70 seconds to process and return the output. ChatGPT times out after approximately 60 seconds and returns a 500 error before my backend completes the processing.

I’ve already tried using a background thread + polling approach on my backend to handle the long-running task, but I’m still running into the timeout issue on the ChatGPT side.

Has anyone faced similar timeout issues when integrating long-running agents with ChatGPT connectors? Is there a recommended pattern (for example, async callbacks, webhooks, or deferred responses) for handling this kind of workload?

Any suggestions or best practices would be appreciated. Thanks!

You can’t make the connector itself pollable. Instead, expose a tool (or tools) inside the connector that starts a long-running job and checks the status of that job.

My MCP server was written in Python, so when I implemented a long-runner, I avoided issues with the Global Interpreter Lock (GIL) by having the tool spawn a separate process to do the work and write its progress to a JSON file every ~5 seconds.

The key point is that you must respond to the MCP request as quickly as possible to avoid timeouts. What I did was return immediately with a handleId (or some other token you can use later to retrieve context), even before the worker actually starts.

The flow looked like this:

Start job request (client) → prepare job + return handleId (server) → server starts the long-running job
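The flow above can be sketched roughly as follows. This is a minimal illustration, not tied to any particular MCP SDK; the names (`start_job`, `JOBS_DIR`) and the progress-file schema are assumptions. The key property is that the tool returns the handleId synchronously, before the worker process has done any real work.

```python
# Sketch of the "return a handleId immediately, do the work later" pattern.
# All names here are illustrative assumptions, not part of any MCP SDK.
import json
import os
import time
import uuid
from multiprocessing import Process

JOBS_DIR = "/tmp/mcp_jobs"  # assumption: a writable scratch directory


def _worker(handle_id: str, payload: dict) -> None:
    """Runs in a separate process (sidestepping the GIL) and writes a
    progress snapshot to a JSON file that the status tool can read."""
    path = os.path.join(JOBS_DIR, f"{handle_id}.json")
    for pct in (10, 40, 70, 100):
        with open(path, "w") as f:
            json.dump({"status": "running" if pct < 100 else "done",
                       "progress": pct}, f)
        time.sleep(0.1)  # a real job would update roughly every ~5 s


def start_job(payload: dict) -> dict:
    """Tool entry point: register the job and return a handleId at once,
    well inside the ~60 s connector timeout."""
    os.makedirs(JOBS_DIR, exist_ok=True)
    handle_id = uuid.uuid4().hex
    # Write an initial status synchronously so a poll can never
    # hit a missing file, even if it arrives before the worker starts.
    with open(os.path.join(JOBS_DIR, f"{handle_id}.json"), "w") as f:
        json.dump({"status": "queued", "progress": 0}, f)
    Process(target=_worker, args=(handle_id, payload), daemon=True).start()
    return {"handleId": handle_id}
```

A separate process (rather than a thread) is what lets CPU-bound work proceed without the GIL throttling the server's request handling.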
Once I had the handleId, I used two versions of the polling, depending on the environment:

- For ChatGPT: spawn an Apps SDK widget along with the first response and use window.openai.callTool() from the widget to poll for status/progress, updating the widget contents accordingly.
- For the CLI: have an agent do the polling instead.

The polling tool just reads the JSON file that the worker process keeps updating. It can be the same tool (the presence or absence of a handleId parameter controls the behavior) or a separate tool.
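A minimal sketch of such a status tool, assuming the worker writes a `{"status": ..., "progress": ...}` snapshot to a per-handle JSON file under a scratch directory (the names `check_job` and `JOBS_DIR` are illustrative, not from any SDK):

```python
# Sketch of a status tool that only reads the worker's latest snapshot,
# so it returns in milliseconds. File layout and names are assumptions.
import json
import os

JOBS_DIR = "/tmp/mcp_jobs"  # assumption: the directory the worker writes to


def check_job(handle_id: str) -> dict:
    """Read the progress file for the given handleId. Since this never
    blocks on the job itself, it stays far below the ~60 s timeout."""
    path = os.path.join(JOBS_DIR, f"{handle_id}.json")
    if not os.path.exists(path):
        return {"status": "unknown", "error": "no such handleId"}
    with open(path) as f:
        return json.load(f)
```

If you prefer a single tool, the same function body can serve as the "status" branch: dispatch on whether the caller supplied a handleId, starting a new job when it is absent and reading the snapshot when it is present.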

Thanks for the detailed explanation — that helps clarify the general long-running job pattern.

However, I think there’s still a gap for my specific case with ChatGPT MCP connectors.

I’ve already implemented a background worker + polling approach on my backend. The issue isn’t the long-running task itself: the MCP request from ChatGPT still needs to return a response within ~60 seconds, and I cannot rely on an Apps SDK widget or client-side polling, since this is a pure MCP connector invoked by ChatGPT.

  1. Returning progress updates every few seconds (or ms) doesn’t help, because ChatGPT does not maintain an open session or continue polling unless explicitly triggered again
  2. Even if I immediately return a handleId, there’s no built-in way for ChatGPT to automatically call the “status” tool again unless the model decides to do so

Both points are true: the agent will simply end the task after triggering the long-runner; it will not wait for its completion and will not re-check the status unless explicitly directed to do so.

There is an experimental MCP capability for your scenario, Tasks (Model Context Protocol), but as far as I know, it is not yet supported.