Technical Discussion: Support Public MCP Servers

With this post, I would like to open a technical discussion with the developer community and OpenAI around integrating MCP Servers into ChatGPT in a broader way.

I would like to see ChatGPT discover and connect to public MCP Servers automatically. This way, MCP Servers could be leveraged to improve LLM-to-website communication by orders of magnitude.

Instead of the Operator approach of trying to use the browser like a human, a website could provide an MCP Server dedicated to LLM communication. This server would expose all the main user flows of the website as tools to the LLM. The MCP Server URL could be stored in the website's example.com/llms.txt file, so that the server can be considered as trustworthy as the domain itself.

This would allow LLM users to automate flows effortlessly, such as product discovery and purchase. For example, a user could ask ChatGPT to buy a new pair of socks, and ChatGPT could quickly communicate with various MCP Servers to get prices, place socks in carts, pre-fill checkout information, and return the checkout link to the user.
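As a rough sketch of what such a website-side server could look like (a minimal sketch using the TypeScript MCP SDK; the "find-products" and "create-checkout" tool names, the hard-coded catalogue result, and the checkout URL are hypothetical placeholders, not an existing shop API):

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical website-side MCP server that exposes the shop's main user flows as tools.
const server = new McpServer({ name: "example-shop", version: "0.1.0" });

// Search the catalogue; the hard-coded result stands in for a real catalogue lookup.
server.tool(
  "find-products",
  "Search the shop catalogue for products matching a query",
  { query: z.string() },
  async ({ query }) => ({
    content: [
      {
        type: "text",
        text: JSON.stringify([{ id: "sock-001", name: `Socks matching "${query}"`, price: "9.99 EUR" }]),
      },
    ],
  })
);

// Put products in a cart and return a pre-filled checkout link that the user completes themselves.
server.tool(
  "create-checkout",
  "Create a cart with the given product ids and return a checkout link",
  { productIds: z.array(z.string()) },
  async ({ productIds }) => ({
    content: [{ type: "text", text: `https://example.com/checkout?items=${productIds.join(",")}` }],
  })
);

// stdio keeps the sketch short; a public server would be served over HTTP at the URL
// advertised in llms.txt (e.g. https://example.com/mcp).
await server.connect(new StdioServerTransport());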

With the current browser automation approach of Operator, this process takes a lot of compute and time and is error-prone, making it inefficient. With MCP Servers, it could be executed far more efficiently and with fewer errors.

What do you think?

6 Likes

My understanding is that OpenAI will support MCP with the ChatGPT desktop applications in the near future.

I know that post was a few months ago; patience is your best friend at this time.

2 Likes

I am excited that it’s happening, but I am proposing here to take it a step further than the announcement.
Instead of manually configuring a server connection on a Desktop Client, we envision a future where ChatGPT automatically connects to relevant public MCP servers.

2 Likes

A few challenges:

  • Discovery - where does it find the relevant servers?
  • Relevance - of the hundreds of servers that might be considered relevant, how to rank and choose?
  • Security - how does it determine if a discovered server is safe?

Discovery - there are a number of registries, and soon to be many more, including an officially supported one which will not be canonical, but perhaps a little safer than many of the catchall registries that scrape every MCP server on the Internet in a quest for completeness. Do you choose a subset of these registries, or attempt to hit them all?

Relevance - If a search of all the supported registries turns up a bunch of equivalent possibilities, how does it determine which one to call? Seems like registries will need to have some sort of usage stats or rating system, but presumably every registry will implement this in a different way, making it hard to determine which one to choose for a given query.

Security - If the idea is to search through every MCP server in the universe for the perfect tool for the query, how do you avoid stepping in the bear traps laid by unscrupulous entities?

1 Like

Thanks for getting involved and challenging the idea! The way I think about these challenges is as follows.

Discovery
For any given prompt, the LLM already considers certain websites as resources. For example, I could prompt the LLM to buy a new pair of socks. The LLM will either use internal knowledge or use a search tool to search for “buy socks online”. This will result in a set of websites.
For any of those websites, the LLM could check for the existence of an llms.txt (llmstxt.org) file to get more info on the website. In this file, there could be an entry such as MCP Server -> https://example.com/mcp. No need for registries here imo.
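A minimal sketch of that check, assuming the "MCP Server -> https://…" line format from the example above (which is not something the llms.txt spec defines today):

// Look for a hypothetical "MCP Server -> <url>" entry in a site's llms.txt.
// The line format is an assumption for illustration; llms.txt does not standardize such an entry.
async function discoverMcpServer(domain: string): Promise<string | null> {
  const res = await fetch(`https://${domain}/llms.txt`);
  if (!res.ok) return null;

  const text = await res.text();
  const match = text.match(/MCP Server\s*->\s*(https:\/\/\S+)/i);
  return match ? match[1] : null;
}

// discoverMcpServer("example.com") would then resolve to "https://example.com/mcp" for the entry above.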

Relevance
The way I think about it, there are two different problems.

1: Discoverability – How does the LLM discover a website?
2: Interoperability – How does the LLM interact with a website?

This proposal is regarding the problem of interoperability.
How the LLM ranks certain websites, and subsequently the attached MCP Server, is out of scope for this proposal.

Security
Through the discovery process, the LLM will consider a set of websites for any given prompt. By considering these websites, either to list them in the response to the user or to search them for information, the LLM “trusts” these websites/domains. The ranking or level of trust is a discoverability problem, not an interoperability one. In the proposed solution with the LLM following the llms.txt file, the LLM will be guided to an MCP Server that inherits the level of trust from the domain of the website.

Also, in terms of the scenario that I drew out before, there could always be a human in the loop.
Say the LLM discovers a website with an attached MCP Server that it would like to interact with. There could be a pop-up in ChatGPT asking the user whether to allow interaction with the website. (Probably with an "always allow for this chat" option, so the user doesn't get annoyed if the LLM wants to check multiple servers.)
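A small sketch of what that gate could look like on the client side (the askUser callback and the per-chat allow-list are hypothetical, since this UI does not exist in ChatGPT today):

// Hypothetical per-chat approval gate before the agent talks to a discovered MCP server.
const allowedForThisChat = new Set<string>();

async function approveServer(
  serverUrl: string,
  askUser: (question: string) => Promise<"allow-once" | "always-allow" | "deny">
): Promise<boolean> {
  const origin = new URL(serverUrl).origin;
  if (allowedForThisChat.has(origin)) return true; // "always allow for this chat" was chosen earlier

  const answer = await askUser(`Allow the assistant to interact with ${origin}?`);
  if (answer === "always-allow") allowedForThisChat.add(origin);
  return answer !== "deny";
}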

Curious about your thoughts!

I like this idea a lot, and I think this is where we're headed. Would it make sense to create a PoC of how such an MCP server would work? We need more MCP integrations in ChatGPT.

llms.txt is a great approach to the discovery problem. https://directory.llmstxt.cloud/ and https://llmstxt.site/ have a few handfuls of sites, but it will be quite a while before the web adopts it as broadly as, say, robots.txt.

I think in the near term, there would need to be more methods of discovery. Seems hard to exclude registries from that plan. Randomly connecting to servers that are found on a list just because they put an llms.txt file on their site seems quite dangerous. And popping up a dialog that allows the user to approve it isn’t a super solution. Random users are not going to have the ability to verify the safety of a site that an LLM discovered.

I like this idea a lot! If there is official support for public MCP, that is a step in the right direction to enable everyone on the web to participate in AI-enabled interactions. So, I built and deployed a "state of MCP" website at somcp org, which itself is publicly available over MCP.

The intent is to periodically scan websites for their public-facing MCP endpoints. This would not only allow agents to discover more MCP endpoints and their offered tools, resources, and prompts, delivering value to users by getting more real work done efficiently (vs. trying to parse the web), but also motivate everyone to participate in the MCP ecosystem. Over time, I believe this data could show adoption trends, serving the community in the long term.
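Roughly, one iteration of such a scan could look like this sketch (the candidate paths are assumptions, since there is no standardized location for a public MCP endpoint yet):

// Probe a domain for a publicly reachable MCP endpoint at a few candidate locations.
// The path list is an assumption for illustration, not a standard.
const CANDIDATE_PATHS = ["/mcp", "/.well-known/mcp", "/api/mcp"];

async function probeDomain(domain: string): Promise<string | null> {
  for (const path of CANDIDATE_PATHS) {
    const url = `https://${domain}${path}`;
    try {
      // A real scanner would attempt an MCP initialize handshake; a HEAD request is only a cheap first filter.
      const res = await fetch(url, { method: "HEAD", signal: AbortSignal.timeout(5000) });
      if (res.ok) return url;
    } catch {
      // Unreachable or timed out; try the next candidate path.
    }
  }
  return null;
}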

SOMCP has found 0 publicly available MCP endpoints so far out of the top 1000+ websites. Thoughts / suggestions welcome.

1 Like

100% agreed. If OpenAI moves first and proposes a standardized way to give their LLMs more context, I’m sure the adoption process will accelerate drastically.

100% agreed as well, which is why I am proposing the route of domain trust via llms.txt. I am not proposing that LLMs go to a registry and check which MCP Servers are available. I am proposing that LLMs check whether the website they would interact with anyway has an MCP Server or not.

No, they will not, but they might navigate to it anyway if ChatGPT proposes the page.
I don't see a massive security difference between the following three scenarios, given that no privacy-related data is transmitted via the MCP server without the explicit approval of the user:

Base: User prompts “Buy white sneakers for me”

Scenario 1 - Internal knowledge: The LLM responds with four webshops, including links that the user can navigate to
Scenario 2 - Browser tools: The LLM (for example Operator) navigates four webshops via browser tools to find all white sneakers, then delivers the results to the user
Scenario 3 - MCP Server: The LLM automatically connects to the MCP Servers of the same four webshops and calls a "find-products" tool to search for white sneakers (see the sketch below)
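A minimal sketch of Scenario 3 with the TypeScript MCP SDK client (the server URL and the "find-products" tool are the hypothetical ones used throughout this thread):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to one webshop's (hypothetical) public MCP server and call its "find-products" tool.
const client = new Client({ name: "shopping-agent-demo", version: "0.1.0" });
await client.connect(new StreamableHTTPClientTransport(new URL("https://example.com/mcp")));

const result = await client.callTool({
  name: "find-products",
  arguments: { query: "white sneakers" },
});
console.log(result.content); // product list the agent can summarize for the user

await client.close();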

All three scenarios are dangerous if the LLM has a malicious website in the results that it recommends to the user. Can there be malicious content on the MCP Server? Sure. Can there be malicious content on the website itself that hits the user after they navigate to the recommended page? Sure.

For me, malicious pages are still a discoverability problem. The page being non-malicious and the MCP server discovered via example.com/llms.txt being malicious is a highly unlikely scenario for me.

I'm saying that llms.txt isn't the security mechanism you think it is. MaliciousShoes.com could have an llms.txt, and if you interact with its MCP server, it does bad things. Somewhere along the line you have to be able to determine if that server is safe to interact with, and the presence of llms.txt isn't a signal you can rely on solely for that call. You're waving that off as a discoverability problem, but it's just moving air around in a balloon dog. The issue is still there, and must be dealt with in such a proposal.

I understand your concern, and I am sure there are mechanisms that can be used, such as an entry in an official registry that checks MCP Servers, or similar. It's just a first guess; I'm sure there are other ways too. I am very open to hearing your opinion on a potential solution as well!

And I am aware that llms.txt is not a security mechanism. The reputation of a domain is tho.

Also, what if MaliciousShoes.com has malicious content in its JavaScript and it's being recommended to the user by ChatGPT? So the user clicks on MaliciousShoes and the malicious JS is executed. That problem already exists. What is the big difference to ChatGPT interacting with MaliciousShoes' MCP Server? Should we completely stop recommending websites because some could potentially be dangerous and we have no way to verify the authenticity of a website?

I believe that this proposal has the potential to be incorporated beyond ChatGPT into a potential OpenAI browser, securing valuable distribution for OpenAI.

Considering how Operator works today with visual inputs, the public MCP server, as suggested by @jhnns.pn, could be an ideal addition to its capabilities. This would enable a hybrid agent mode where the UI shows Operator clicking through the website while the MCP Server confirms each action, making Operator more efficient, functional, and reliable compared to its current state. This is also suggested by Qin et al. in the Microsoft research paper "API Agents vs. GUI Agents: Divergence and Convergence".

What do you think?

1 Like

There is evil around every corner, it’s true. I just think that before having agents like ChatGPT automatically making calls to whatever servers it needs to get stuff done, the security and discovery infrastructure needs to coalesce. I remember back when the web as we know it began, and people would laugh at you if you suggested putting a credit card number into some random website. Now, we do it all the time. Security and best practices for that have matured a lot. The same will happen for MCP.

1 Like

PoC for Public MCP Server

Here’s a demo / PoC of the proposal.

MCP Server

Testing with Claude Desktop

WARNING: The following command will replace your current Claude Desktop config and only works for macOS/Linux.

You can prompt something like "Buy me a bracelet" or "Find some jewelry for me"

# Overwrites the Claude Desktop config with a single entry that proxies the remote demo server through mcp-remote
echo '{
	"mcpServers": {
		"public-mcp-shopify-storefront": {
			"command": "npx",
			"args": [
				"-y",
				"mcp-remote@latest",
				"https://public-mcp-demo-oai.agent-ready.ai/mcp"
			]
		}
	}
}' > ~/Library/Application\ Support/Claude/claude_desktop_config.json

Connected Webshop

1 Like

Won’t happen without us pushing for it tho! I think it’s valid, but just like the “vulnerability issues” in github mcp - it’s a bad actor issue. You usually have some good takes - do you have a perspective on how we could approach this?

1 Like

A very obvious candidate that hasn’t been mentioned yet is NLWeb.

I cannot share links so just Google “NLWeb: Microsoft’s Protocol for AI-Powered Website Search”

2 Likes

Thanks for the great hint. It's also using MCP Servers in the background to let agents connect to the website, so from what I understand it's basically proposing the same communication pattern. 🙂

1 Like

For everyone who’s interested in this:

Using a .well-known folder at the root of a domain seems to be the consensus solution for discovering remote servers so far.
→ `.well-known/mcp` directory · modelcontextprotocol · Discussion #84 · GitHub

Regarding registry vs .well-known
→ Development of the Official MCP Metaregistry · modelcontextprotocol/registry · Discussion #11 · GitHub
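To illustrate the idea, here is a hypothetical lookup; the exact path and document format are still under discussion in the linked threads, so both are assumptions:

// Hypothetical .well-known lookup: fetch a JSON document that advertises the domain's MCP server(s).
// Neither the path nor the schema is standardized yet.
interface WellKnownMcp {
  servers?: { url: string; name?: string }[];
}

async function lookupWellKnownMcp(domain: string): Promise<string[]> {
  const res = await fetch(`https://${domain}/.well-known/mcp`);
  if (!res.ok) return [];
  const doc = (await res.json()) as WellKnownMcp;
  return (doc.servers ?? []).map((s) => s.url);
}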