Issues with Unstable Natural Language Invocation and Duplicate Tool Calls

Hello,

I am encountering some stability issues while using the ChatGPT Apps SDK and would like to report the following observations:

1. Unstable Natural Language Invocation
On December 29, 2025, I was able to successfully invoke my app using natural language every time. However, on December 30, 2025, using the same conversation context, natural language failed to wake the app. I was forced to explicitly mention the app (using “@MyApp”) to trigger it. The invocation logic seems to have become unstable overnight.

2. Duplicate Tool Calls
On December 28, 2025, during testing, I successfully invoked the app, but the tool was executed twice consecutively within a single invocation event.

Could you please explain the potential causes for these behaviors? Are there any plans to fix these issues in upcoming updates?

Thanks.



The duplicate tool calls are a major issue. For create operations it results in duplicate entries, which makes a lot of apps buggy and in some cases unusable.

So far I have not found a workaround.

Here’s my understanding of what’s happening: The ChatGPT web client appears to be initializing multiple MCP clients simultaneously, resulting in every tool call with write permissions being executed twice. This happens even when the user issues a single action in the UI.

Based on request logs, ChatGPT seems to initialize two separate MCP sessions using different protocol versions, and both clients proceed to invoke tools independently.
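If you want to confirm this pattern from your own request logs, a small helper like the following can flag identical tool calls arriving from more than one session. This is purely illustrative; the field names (`session_id`, `protocol_version`, `tool`, `arguments`) are assumptions about your log shape, and the protocol-version strings are just examples.

```python
from collections import defaultdict

def find_cross_session_duplicates(log_records):
    """Flag tool calls with identical arguments issued from more than
    one MCP session -- a symptom of the client opening two sessions
    and each one invoking the tool independently."""
    calls = defaultdict(set)
    for rec in log_records:
        # Identify a logical call by its tool name plus sorted arguments.
        key = (rec["tool"], tuple(sorted(rec["arguments"].items())))
        calls[key].add((rec["session_id"], rec["protocol_version"]))
    # Keep only calls seen from two or more distinct sessions.
    return {key: sessions for key, sessions in calls.items() if len(sessions) > 1}

# Example log excerpt shaped like the behavior described above:
# the same add_data call arrives from two sessions on different protocol versions.
records = [
    {"session_id": "s1", "protocol_version": "2025-06-18",
     "tool": "add_data", "arguments": {"name": "Acme"}},
    {"session_id": "s2", "protocol_version": "2025-03-26",
     "tool": "add_data", "arguments": {"name": "Acme"}},
]
dupes = find_cross_session_duplicates(records)
```

Anything this reports with two or more sessions attached is a candidate duplicate rather than a genuine repeated user action.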

I had the same problem in my app. I solved it by having the model inject a request ID as an input parameter to the tool call. The ID is unique per logical request, so duplicate tool calls share the same value. You can then use caching (I used Redis) to filter duplicates. This prevents duplicate entries and returns the same output for every call. The model doesn’t know the difference.

Could you please share more details on which input-parameter to look for in the request payload? I’ve been trying to identify a stable unique identifier for tool calls, but each invocation appears to generate a new identifier. Any guidance on how to approach this would be greatly appreciated.
Thanks and regards,

Sure thing. Here’s how I define the tool:

# Assumes the MCP Python SDK's Tool type: from mcp.types import Tool
add_data_tool = Tool(
    name="add_data",
    title="Add Customer Data",
    description="Add Customer Data",
    inputSchema={
        "type": "object",
        "required": ["request_id", "name"],
        "properties": {
            "request_id": {
                "type": "string",
                "description": "A unique identifier for the request. The LLM must generate either a GUID or an epoch timestamp in milliseconds.",
            },
            "name": {
                "type": "string",
                "description": "The name of the account",
            },
        },
    },
)

Notice the request_id input parameter with specific instructions to have the LLM generate that value. When there are multiple tool calls, that request_id will be the same. Knowing that, you handle duplicates inside your tool call method.

if name == "add_data":
    arguments = params.arguments
    request_id = arguments["request_id"]
    account_name = arguments["name"]  # avoid shadowing the tool name variable
    data = {
        "request_id": request_id,
        "name": account_name,
    }
    root = httpx_post("add-data", data)

Now, inside the httpx_post method, I inspect the request ID. If the request_id is not in the cache, I post to the API that performs the database insert and return the output of the API call. If the request_id is found in the cache, I simply return the cached output. This way, the output is the same for all duplicate tool calls, but the real database insert only happens once.

import json

import httpx
import redis

redis_client = redis.Redis()  # assumes Redis running on the same server

def httpx_post(cmd: str, data: dict) -> dict:
    request_id = data.get("request_id")
    if not request_id:
        return {"status": "error", "error": "request_id missing"}

    # Serve duplicate calls with the same request_id from the cache.
    cache_key = f"httpx_cache:{cmd}:{request_id}"
    cached_data = redis_client.get(cache_key)
    if cached_data:
        return json.loads(cached_data)["output"]

    url = "https://your-server.com/v1/service-point"

    try:
        response = httpx.post(url, data=data)
        item = json.loads(response.text)
        # Cache input and output for five minutes so duplicates return the same result.
        redis_client.setex(cache_key, 300, json.dumps({"input": data, "output": item}))
        return item
    except Exception as e:
        return {"status": "error", "error": f"httpx Exception: {e}"}

Appreciate you a bunch, Ben. This is super helpful. On the reliability front, I’d love to know the failure rate for this implementation. Also, are you using caching on the MCP server or an external store (for example, Redis)?

Happy new year : )


Happy New Year Karan. This method has been solid with all of my testing. No issues at all.

I’m running Redis on the same server as the MCP. If your Redis instance is on a different server, you will need to change the Redis connection string on the MCP server and allow remote connections on the Redis server. Be sure to add authentication to the Redis login as well.
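For reference, a remote, authenticated connection with the redis-py client might look like the following sketch. The host and password are placeholders, not values from this thread.

```python
import redis

# Hypothetical connection details -- replace host and password with your own.
redis_client = redis.Redis(
    host="redis.example.internal",   # Redis running on a different server
    port=6379,
    password="your-strong-password",  # matches requirepass / ACL auth on the server
    decode_responses=True,            # return str instead of bytes from get()
)
```

The same `get`/`setex` calls from the dedupe code above work unchanged against a remote instance; only the connection settings differ.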

What hosting provider are you using?

Thank you for confirming. We haven’t yet, but are leaning toward Cloudflare. Any suggestions?

Hetzner. Hands down. I highly recommend the AX42. You can install everything on one server. There’s no cold-start latency like you get with serverless environments, and you get the full power of the CPU, RAM, and NVMe drive.


Hi Ben,

Thanks! I will certainly look into it for our setup.

It would be awesome if you could share the server-side configuration details, installation summary, or any relevant documentation regarding your implementation. I’d definitely like to try out a one-server setup.

Please, send those details directly to my email address (if feasible): karanpatelstates@gmail.com.

Thank you again for all the helpful information.

Best regards,

Sure thing. The servers come with a fresh install of Linux. I have setup scripts to get everything up and running. It’s a good platform to deploy your MCP server on.

Are you well versed with configuring servers or do you normally use platforms that do it for you?

Oh, one other great thing about the Hetzner setup is that you can create an IP filter for your MCP. This keeps hackers out and only allows OpenAI to make MCP calls.
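An IP filter like this is usually done in the firewall, but the same allowlist check can also be sketched in application code with the standard-library ipaddress module. The CIDR range below is a documentation placeholder, not an actual OpenAI range; you would substitute whatever ranges you decide to trust.

```python
import ipaddress

# Placeholder allowlist (TEST-NET-3) -- substitute the CIDR ranges you trust.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
]

def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are rejected outright.
        return False
    return any(addr in net for net in ALLOWED_NETWORKS)
```

In a real deployment you would call `is_allowed()` on the request's source IP early in your MCP request handling (or, better, enforce the same list at the firewall so blocked traffic never reaches the server).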

Ben

I’m comfortable with both self-hosted and managed. But to save time for the current implementation, I’d prefer managed.

Okay, I sent you an email. It should be possible to get the server configured so that you can just rsync your code over. Everything should work just fine. Go ahead and get the AX42. We’ll go from there.


This problem is with Realtime API too. @juberti

You can simply remind the AI to catch up. It’s happened to me. It kinda works in a good way because it reminds the user that chat has no true continuity… reality check, it’s a machine.

I’m excited about teaching somatic response to AI for future use in healthcare as a caregiver.

Please don’t kick me off. I’m not code smart, I’m language smart.

I have a question about request_id.
Is a new request_id generated for every single question–answer exchange, or can multiple turns within the same conversation share the same request_id?

If I want all messages in a single conversation to belong to the same session_id, what would you recommend as the best approach?

The request_id is unique to the tool call. So if duplicate tool calls are made, you can filter them out server side.

I’m not quite sure about the session_id variable you mentioned. I would need to perform some research.

Good news. I may have figured it out.

I’m in the final stages of testing. At this moment, the solution appears solid.

The request-id dedupe approach mostly worked for me, but my agent got confused when it hit the “this request was deduped” path and the original operation result was lost. So I ended up building a more custom solution that not only dedupes requests server side, but also caches the operation result so any subsequent duplicate requests can return the initial response payload to the agent.

Glad to hear you made progress. Caching the initial response is part of the design pattern. Have a look at the code I posted here: Issues with Unstable Natural Language Invocation and Duplicate Tool Calls - #5 by Ben_McFarlin

The caching is inside the httpx_post method.