With this post, I would like to open up a technical discussion with the developer community and OpenAI around integrating MCP Servers into ChatGPT more broadly.
I would like to see ChatGPT discover and connect to public MCP Servers automatically. This way, MCP Servers could be leveraged to improve LLM-to-website communication by orders of magnitude.
Instead of the Operator approach of trying to use the browser like a human, a website could provide an MCP Server dedicated to LLM communication. This server would expose all the main user flows of the website as tools to the LLM. The MCP Server URL could be published in the website's llms.txt file (e.g. example.com/llms.txt), so that the server can be considered as trustworthy as the domain itself.
This would allow LLM users to automate flows such as product discovery and purchase effortlessly. For example, a user could ask ChatGPT to buy a new pair of socks, and ChatGPT could quickly communicate with various MCP Servers to get prices, place socks in the carts, pre-fill checkout information, and return the checkout link to the user.
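To make the proposal more concrete, here is a minimal sketch of what such a webshop MCP Server could look like, assuming the official TypeScript SDK (@modelcontextprotocol/sdk). The tool names, parameters, and the shopApi backend are hypothetical stand-ins for the sock-buying flow above, not part of any existing site.

```ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical shop backend; a real server would call the website's internal APIs here.
const shopApi = {
  search: async (query: string, max: number) =>
    [{ id: "sock-123", name: "White socks", price: "9.99 EUR" }].slice(0, max),
  addToCart: async (productId: string, quantity: number) =>
    `https://example.com/checkout?item=${productId}&qty=${quantity}`,
};

const server = new McpServer({ name: "example-webshop", version: "0.1.0" });

// Product discovery: the LLM asks for matching products and prices.
server.tool(
  "find-products",
  { query: z.string(), maxResults: z.number().optional() },
  async ({ query, maxResults }) => {
    const products = await shopApi.search(query, maxResults ?? 10);
    return { content: [{ type: "text" as const, text: JSON.stringify(products) }] };
  }
);

// Checkout preparation: put an item in the cart and return a pre-filled checkout link.
server.tool(
  "add-to-cart",
  { productId: z.string(), quantity: z.number().default(1) },
  async ({ productId, quantity }) => {
    const checkoutUrl = await shopApi.addToCart(productId, quantity);
    return { content: [{ type: "text" as const, text: checkoutUrl }] };
  }
);

// A public server would use an HTTP transport; stdio keeps this sketch self-contained.
await server.connect(new StdioServerTransport());
```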
With the current browser automation approach of Operator, this process takes a lot of compute and time and is error-prone, making it inefficient. With MCP Servers, the same process could be executed far more efficiently and with fewer errors.
I am excited that it's happening, but I am proposing here to take it a step further than the announcement.
Instead of manually configuring a server connection on a Desktop Client, we envision a future where ChatGPT automatically connects to relevant public MCP servers.
Discovery - where does it find the relevant servers?
Relevance - of the hundreds of servers that might be considered relevant, how to rank and choose?
Security - how does it determine if a discovered server is safe?
Discovery - there are a number of registries, and soon to be many more, including an officially supported one which will not be canonical, but perhaps a little safer than many of the catchall registries that scrape every MCP server on the Internet in a quest for completeness. Do you choose a subset of these registries, or attempt to hit them all?
Relevance - If a search of all the supported registries turns up a bunch of equivalent possibilities, how does it determine which one to call? Seems like registries will need to have some sort of usage stats or rating system, but presumably every registry will implement this in a different way, making it hard to determine which one to choose for a given query.
Security - If the idea is to search through every MCP server in the universe for the perfect tool for the query, how do you avoid stepping in the bear traps laid by unscrupulous entities?
Thanks for getting involved and challenging the idea! The way I think about these challenges is as follows.
Discovery
For any given prompt, the LLM already considers certain websites as resources. For example, I could prompt the LLM to buy a new pair of socks. The LLM will either use internal knowledge or use a search tool to search for "buy socks online". This will result in a set of websites.
For any of those websites, the LLM could check for the existence of an llms.txt (llmstxt.org) file to get more info on the website. In this file, there could be an entry such as MCP Server -> https://example.com/mcp. No need for registries here imo.
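As a rough sketch of that lookup (the "MCP Server -> <url>" line format is my assumption, not part of the llms.txt spec):

```ts
// Sketch: check a domain's llms.txt for an advertised MCP Server endpoint.
async function discoverMcpEndpoint(domain: string): Promise<string | null> {
  const res = await fetch(`https://${domain}/llms.txt`);
  if (!res.ok) return null;

  // Look for a line such as: MCP Server -> https://example.com/mcp
  const match = (await res.text()).match(/MCP Server\s*->\s*(https:\/\/\S+)/i);
  return match ? match[1] : null;
}

// discoverMcpEndpoint("example.com") would resolve to "https://example.com/mcp" or null.
```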
Relevance
The way I think about it, there are two different problems.
1: Discoverability - How does the LLM discover a website?
2: Interoperability - How does the LLM interact with a website?
This proposal is regarding the problem of interoperability.
How the LLM ranks certain websites, and subsequently the attached MCP Server, is out of scope for this proposal.
Security
Through the discovery process, the LLM will consider a set of websites for any given prompt. By considering these websites, either to list them in the response to the user or to search them for information, the LLM "trusts" these websites/domains. The ranking or level of trust is a discoverability problem, not an interoperability one. In the proposed solution with the LLM following the llms.txt file, the LLM will be guided to an MCP Server that inherits the level of trust from the domain of the website.
Also, in terms of the scenario that I drew out before, there could always be a human in the loop.
Say the LLM discovers a website with an attached MCP Server that it would like to interact with. There could be a pop-up in ChatGPT asking the user whether to allow interaction with the website or not (probably with an "always allow for this chat" option, so the user does not get annoyed if the LLM wants to check multiple servers).
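A rough sketch of that gating logic, with a per-chat allow-list (the function names and the dialog are hypothetical; a real client would wire this into its UI):

```ts
// Domains the user has approved with "always allow for this chat".
const approvedDomains = new Set<string>();

// Placeholder for the actual approval pop-up; here it simply denies by default.
async function askUser(question: string): Promise<"allow" | "always-allow" | "deny"> {
  console.log(question);
  return "deny";
}

async function mayConnect(domain: string): Promise<boolean> {
  if (approvedDomains.has(domain)) return true;

  const answer = await askUser(`Allow interaction with the MCP Server of ${domain}?`);
  if (answer === "always-allow") approvedDomains.add(domain);
  return answer !== "deny";
}
```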
I like this idea a lot and I think this is the future of where we're headed. Would it make sense to create a POC of how such an MCP server would work? We need more MCP integrations in ChatGPT.
llms.txt is a great approach to the discovery problem. https://directory.llmstxt.cloud/ and https://llmstxt.site/ have a few handfuls of sites, but it will be quite a while before the web adopts it as broadly as, say, robots.txt.
I think in the near term, there would need to be more methods of discovery. It seems hard to exclude registries from that plan. Randomly connecting to servers that are found on a list just because they put an llms.txt file on their site seems quite dangerous. And popping up a dialog that lets the user approve it isn't a sufficient solution. Random users are not going to have the ability to verify the safety of a site that an LLM discovered.
I like this idea a lot! If there is official support for public MCP, that is a step in the right direction to enable participation of everyone on the web in AI-enabled interactions. So, I built and deployed a "state of MCP" website at somcp.org, which itself is publicly available over MCP.
The intent is to periodically scan websites for their public-facing MCP endpoints. This will not only allow agents to discover more MCP endpoints and deliver value to users by knowing their offered tools, resources, and prompts to get more real work done efficiently (vs. trying to parse the web), but also motivate everyone to participate in the MCP ecosystem. Over time, I believe this data could show adoption trends, serving the community in the long term.
SOMCP has found 0 publicly available MCP endpoints so far out of the top 1,000+ websites. Thoughts/suggestions welcome.
100% agreed. If OpenAI moves first and proposes a standardized way to give their LLMs more context, I'm sure the adoption process will accelerate drastically.
100% agreed also, which is why I am proposing the route of domain trust with the llms.txt. It's not that I am proposing that LLMs go to a registry and check which MCP Servers are available. I am proposing that LLMs should check whether the website they would interact with anyway has an MCP Server or not.
No they will not, but they might navigate to it anyway if ChatGPT proposes the page.
I don't see a massive security difference between the following three scenarios, given that no privacy-related data is transmitted via the MCP server without the explicit approval of the user:
Base: User prompts "Buy white sneakers for me"
Scenario 1 - Internal knowledge: The LLM responds with four webshops, including links the user can navigate to.
Scenario 2 - Browser tools: The LLM (for example Operator) navigates four webshops via browser tools to find all white sneakers, then delivers the results to the user.
Scenario 3 - MCP Server: The LLM automatically connects to the MCP Servers of the same four webshops and calls a "find-products" tool to search for white sneakers.
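To make scenario 3 concrete, here is a rough client-side sketch, assuming the official TypeScript SDK's Streamable HTTP transport; the shop endpoints and the find-products tool are hypothetical:

```ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Hypothetical endpoints that discovery (e.g. via llms.txt) could have returned.
const shops = ["https://shoes-a.example/mcp", "https://shoes-b.example/mcp"];

for (const url of shops) {
  const client = new Client({ name: "chatgpt-shopping", version: "0.1.0" });
  await client.connect(new StreamableHTTPClientTransport(new URL(url)));

  // Call the shop's product search tool instead of driving a browser.
  const result = await client.callTool({
    name: "find-products",
    arguments: { query: "white sneakers" },
  });
  console.log(url, result.content);

  await client.close();
}
```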
All three scenarios are dangerous if the LLM has a malicious website in the results that it recommends to the user. Can there be malicious content on the MCP Server? Sure. Can there be malicious content on the website itself that hits after the user navigates to the recommended page? Sure.
For me, malicious pages are still a discoverability problem. The page being non-malicious and the MCP server discovered via example.com/llms.txt being malicious is a highly unlikely scenario for me.
I'm saying that llms.txt isn't the security mechanism you think it is. MaliciousShoes.com could have an llms.txt, and if you interact with its MCP server, it does bad things. Somewhere along the line you have to be able to determine if that server is safe to interact with, and the presence of llms.txt isn't a signal you can rely on solely for that call. You're waving that off as a discoverability problem, but it's just moving air around in a balloon dog. The issue is still there, and must be dealt with in such a proposal.
I understand your concern, and I am sure there are mechanisms that can be used, such as an entry in an official registry that checks MCP Servers, or similar. It's just a first guess; I'm sure there are other ways too. I am very open to hearing your opinion on a potential solution as well!
And I am aware that llms.txt is not a security mechanism. The reputation of a domain is, though.
Also, what if MaliciousShoes.com has malicious content in its JavaScript and it's being recommended to the user by ChatGPT? The user clicks on MaliciousShoes and the malicious JS is executed. That problem already exists. What is the big difference to ChatGPT interacting with MaliciousShoes' MCP Server? Should we completely stop recommending websites because some could potentially be dangerous and we have no way to verify the authenticity of a website?
I believe this proposal could be incorporated beyond ChatGPT into a potential OpenAI browser, securing valuable distribution for OpenAI.
Considering how Operator works today with visual inputs, the public MCP server, as suggested by @jhnns.pn, could be an ideal addition to its capabilities. This would enable a hybrid agent mode where the UI shows Operator clicking through the website while the MCP Server confirms each action, making Operator more efficient, functional, and reliable compared to its current state. This is also suggested by Qin et al. in the Microsoft research paper "API Agents vs. GUI Agents: Divergence and Convergence".
There is evil around every corner, it's true. I just think that before having agents like ChatGPT automatically make calls to whatever servers they need to get stuff done, the security and discovery infrastructure needs to coalesce. I remember back when the web as we know it began, and people would laugh at you if you suggested putting a credit card number into some random website. Now, we do it all the time. Security and best practices for that have matured a lot. The same will happen for MCP.
Won't happen without us pushing for it though! I think it's valid, but just like the "vulnerability issues" in the GitHub MCP, it's a bad-actor issue. You usually have some good takes - do you have a perspective on how we could approach this?
Thanks for the great hint. It's also using MCP Servers in the background to let agents connect to the website, so from what I understand it's basically proposing the same communication pattern.