Web Browser and AI Assistant Interoperability

Hello. I am considering opening an issue with the Web Incubator Community Group (WICG) with some ideas towards advancing interoperability between Web browsers and AI assistants. Note that anyone interested can open a new WICG issue, and can join existing Community Groups or create new ones, to enable technical discussion towards drafting documents.

With respect to Web browser and AI assistant interoperability, here are the three main points considered thus far:

Point one: Should HTML5 metadata allow webpages to declare whether AI assistants can attach to them?

The default value is envisioned as allowing attachment. Scenarios for expressly blocking attachment include banking websites and credit card transaction pages.

This expressiveness could be provided with some metadata resembling:

<meta name="assistants" content="allow" />
<meta name="assistants" content="block" />
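
To make the intended semantics concrete, here is a minimal sketch of how a browser or assistant might resolve such a declaration. The function names, the regex-based extraction, and the rule that an explicit "block" wins over "allow" are all assumptions for illustration; none of this is specified anywhere.

```typescript
// Hypothetical policy values; "allow" is the default envisioned above.
type AssistantPolicy = "allow" | "block";

// Collect the content values of every <meta name="assistants"> tag.
// Sketch only: a real user agent would use its HTML parser, not a regex.
function readAssistantMeta(html: string): string[] {
  const pattern = /<meta\s+name="assistants"\s+content="([^"]*)"\s*\/?>/gi;
  const values: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    values.push(match[1]);
  }
  return values;
}

// Resolve the effective policy from the declared values.
function resolveAssistantPolicy(contents: string[]): AssistantPolicy {
  const normalized = contents.map((c) => c.trim().toLowerCase());
  // Assumption: any explicit "block" wins, so conflicting pages fail closed.
  if (normalized.indexOf("block") !== -1) return "block";
  // No declaration, or only "allow", yields the default.
  return "allow";
}
```

Letting "block" win on conflict is one possible design choice; it favors the banking-style scenarios where a stray "allow" should not weaken the page's intent.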

Point two: Could document markup, metadata, and/or schema, e.g., JSON-LD, enhance interoperability and/or enable new scenarios?

For example, this could allow webpages to declare what kind of content they are, e.g., news articles or encyclopedia articles, and to refer to and describe things, or objects, occurring in the webpages.
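
As a rough illustration, a page could describe itself with JSON-LD using the existing schema.org vocabulary. The sketch below builds such a description and serializes it into a script element; the headline, topic, and organization names are placeholders, and toJsonLdScript is a hypothetical helper, not part of any library.

```typescript
// A JSON-LD description using schema.org types (placeholder values).
const articleMetadata = {
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  headline: "Example headline",
  // "about" names the main subject; "mentions" lists things referred to.
  about: { "@type": "Thing", name: "Example topic" },
  mentions: [{ "@type": "Organization", name: "Example Organization" }],
};

// Serialize the description into a script element for embedding in a page.
function toJsonLdScript(data: object): string {
  // Escape "<" so the serialized JSON cannot close the script element early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return '<script type="application/ld+json">' + json + "</script>";
}
```

An assistant attached to the page could then read the "@type" value to learn that it is looking at a news article, without having to infer this from the rendered text.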

Point three: JavaScript API.

  1. Should a webpage be able to detect the existence of, presence of, or capability for an available AI assistant to attach to it?
    1. If so, should this require a user permission?
  2. Should a webpage be able to obtain descriptive text strings, like user-agent strings, identifying those AI assistants present on a system and interacting with the Web browser?
    1. If so, should this require a user permission?
  3. Should a webpage be able to listen for an event when an AI assistant attaches itself to a page?
    1. If so, should this require a user permission?
    2. Should the event data include a user-agent string for the AI assistant attaching?
    3. Should the event data describe the capabilities of the AI assistant?
  4. Should a website be able to install and uninstall a “plugin” for a browser-integrated AI assistant?
    1. If so, should this require a user permission?
    2. In theory, an installed “plugin” could be activated only while the user is on the domain from which the “plugin” was installed.
  5. What are some of the capabilities that a bidirectional JavaScript API could make available? Example topics include:
    1. Sharing, clipboarding, dragging-and-dropping content from a webpage into AI assistant chats.
    2. Sharing, clipboarding, dragging-and-dropping content from AI assistant chats into a webpage.
    3. Making available and describing JavaScript functions from a webpage for an attached AI assistant to consume and utilize.
    4. As in AutoGen, a webpage could interface with an AI assistant as an agent in multi-agent dialog scenarios.
  6. Should a webpage be able to listen for an event when an AI assistant detaches itself from a page?
    1. If so, should this require a user permission?
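
To give the attach and detach questions above something concrete to react to, here is a simulated sketch of what such events might look like from a page's point of view. Everything in it is hypothetical: the AssistantBridge class, the event names "assistantattach" and "assistantdetach", the capability strings, and the permission gate are assumptions for discussion, and no browser exposes such an API today.

```typescript
// Hypothetical event data: a user-agent string plus declared capabilities.
interface AssistantInfo {
  userAgent: string; // descriptive string identifying the assistant
  capabilities: string[]; // assumed capability names, e.g. "share"
}

type AssistantEventType = "assistantattach" | "assistantdetach";
type AssistantListener = (info: AssistantInfo) => void;

// Simulated stand-in for whatever object a browser might expose to pages.
class AssistantBridge {
  private permissionGranted = false;
  private listeners: Record<AssistantEventType, AssistantListener[]> = {
    assistantattach: [],
    assistantdetach: [],
  };

  // Stands in for the user accepting a permission prompt.
  grantPermission(): void {
    this.permissionGranted = true;
  }

  addEventListener(type: AssistantEventType, listener: AssistantListener): void {
    this.listeners[type].push(listener);
  }

  // Called by the simulated browser; returns how many listeners were notified.
  dispatch(type: AssistantEventType, info: AssistantInfo): number {
    // Without permission, the page learns nothing about the assistant.
    if (!this.permissionGranted) return 0;
    for (const listener of this.listeners[type]) listener(info);
    return this.listeners[type].length;
  }
}

// Usage: a page records attach and detach notifications.
const bridge = new AssistantBridge();
const seen: string[] = [];
bridge.addEventListener("assistantattach", (info) => {
  seen.push("attach:" + info.userAgent + ":" + info.capabilities.join(","));
});
bridge.addEventListener("assistantdetach", (info) => {
  seen.push("detach:" + info.userAgent);
});

const assistant: AssistantInfo = { userAgent: "ExampleAssistant/1.0", capabilities: ["share"] };
const beforePermission = bridge.dispatch("assistantattach", assistant); // not delivered
bridge.grantPermission();
bridge.dispatch("assistantattach", assistant);
bridge.dispatch("assistantdetach", assistant);
```

In this sketch the permission gate silently drops events rather than raising an error, so a page cannot distinguish "no assistant present" from "permission not granted". Whether that is the right privacy trade-off is exactly the kind of question the list above raises.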

Is there any interest, here, in these technical topics? Beyond the three main points indicated, are there any other main points to consider? Is there any interest, here, in brainstorming to expand the potential API capabilities listed above (item 5 under point three)? Thank you.

I don’t think this has a reasonable mandate.

IMHO, calls to AI services should happen on the server, not the web client, mainly because you can’t handle third-party credentials in the client.

Exciting use cases for client-side features include, but are not limited to, enabling conversational user interfaces for websites, webpages, and web applications.

Additionally, websites, webpages, and web applications will provide their own chatbots, and end-users could engage with both browser-based and site-based dialog systems. As a result of standardization, these dialog systems would also be able to interact with one another.

Standardization efforts could enhance both server-side and client-side scenarios.

I am not sure that I understand the third-party credentials scenarios that you are describing. If I understand correctly, are you referring to technologies like OAuth2?