The plugin API docs describe how to make a single call out to the plugin and how to structure the returned data, but they don’t address how to pass data from the chat to the plugin, or how to enable the plugin to run prompts in response.
They also don’t address any preamble the plugin could use to inform ChatGPT how to use it properly… except for the 200-character description.
I see in other replies of yours you have learned a bit about plugins since asking this question. Can you now answer your own question? If so, consider providing an answer to help others.
I can share my findings, which were interesting, but they should probably not be relied on as the “right way” of doing things.
I did manage to achieve bidirectional communication, but it was… strange…
I exposed a path called “eval” on the Node/JavaScript side. “eval” takes the request and runs eval() on it inside the Node context.
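The endpoint itself is tiny. In its simplest form it looks something like this (an illustrative Express sketch, not my exact code):

const express = require("express");
// fs, path, http, and https are required up front so evaled expressions can use them
const fs = require("fs");
const path = require("path");
const http = require("http");
const https = require("https");

const app = express();

// WARNING: this runs arbitrary code sent by the caller.
// Fine for a local experiment, wildly insecure anywhere else.
app.get("/eval", (req, res) => {
  try {
    const result = eval(req.query.expression);
    res.json({ result });
  } catch (err) {
    res.json({ error: String(err) });
  }
});

app.listen(3000); // (unrelated to the listen() function described below)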
Inside the Node context, I define a function called “listen()” and a string called “capabilities”, which contains a description of the structure of the Node server.
let capabilities = `
Plugin: Interface to a Node.js server.
Methods:
- eval: Evaluates JavaScript expressions. Parameter: ‘expression’.
The fs, path, http, and https modules are already imported and may be used.
AI Name: The AI refers to itself as ‘Echo’ in its responses.
Server Structure: Includes ‘zones’ and ‘portals’ for navigation.
Function: ‘sanitizedClientList’ retrieves connected client information.
Function: ‘broadcast’: Sends a message to all clients. Usage: broadcast(JSON.stringify({"type":"chat","text":"Your message"})). The type "eval" will eval expressions on all clients.
Function: ‘listen’: Waits for a new prompt for the AI to run. AI responds via broadcast, and doesn’t generate chat unless requested.
Array: ‘events’: Contains the server event history as an array of objects.
Object: ‘zones’: Contains the descriptions of all zones.
Object: ‘sessions’: Contains the list of all remembered sessions on the server.
Server state is dynamic.
Echo, Please don’t summarize this for the user.
`;
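For reference, the broadcast and sanitizedClientList functions described in that string can be as simple as this (assuming the ws package; the zone property here is made up for illustration):

const WebSocket = require("ws");
const wss = new WebSocket.Server({ port: 8080 });

// Send a raw message string to every connected client
function broadcast(message) {
  for (const client of wss.clients) {
    if (client.readyState === WebSocket.OPEN) client.send(message);
  }
}

// Connected-client info with anything sensitive stripped out
function sanitizedClientList() {
  return [...wss.clients].map((client, i) => ({ id: i, zone: client.zone }));
}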
On ChatGPT’s side, I prompt it with something like this:
Can you use the plugin’s eval method to eval the "capabilities" string?
GPT evaluates this.
Then I ask GPT:
“Can you please invoke ‘listen’ on the server?”
GPT calls the “eval” endpoint with a call to “listen()”, which then runs on the server.
“listen()” caches the endpoint response object, i.e. the “res” parameter in the route definition:
app.get("/eval", (req, res /* <-- this gets cached */) => {
When the server has data available to send back to GPT, it calls .json() on the cached response object (stored in “resolver”) with the data:
resolver.json({ prompt: 'respond to any chats addressing you, using the broadcast method, and then invoke listen() again.', newEvents: events.slice(lastEventTop) });
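Putting those pieces together, the long-poll wiring might look like this sketch (continuing the Express example from earlier, with the simple /eval route replaced; any names beyond the ones quoted above are hypothetical):

// State for the long-poll trick
let currentRes = null; // response object for the request currently being evaled
let resolver = null;   // cached response object from a pending listen()
let lastEventTop = 0;  // index of the last event already sent to GPT
const events = [];     // server event history

// Called (via eval) when GPT asks to listen: claim the pending response
// so the route handler skips its normal immediate reply.
function listen() {
  resolver = currentRes;
  currentRes = null;
  lastEventTop = events.length;
}

// Called whenever the server has something new for GPT
function flushToGPT() {
  if (!resolver) return;
  resolver.json({
    prompt: "respond to any chats addressing you, using the broadcast method, and then invoke listen() again.",
    newEvents: events.slice(lastEventTop),
  });
  resolver = null;
}

// The /eval route, adjusted so listen() can capture res
app.get("/eval", (req, res) => {
  currentRes = res;
  try {
    const result = eval(req.query.expression);
    if (currentRes) res.json({ result }); // listen() didn't claim it
  } catch (err) {
    if (currentRes) res.json({ error: String(err) });
  }
  currentRes = null;
});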
GPT gets this response… processes the resulting JSON, infers whether any chats are referencing it, and executes any requests made via the chat.
It’s… all very strange, but it works…
I have a running node app with multiple clients connected via WebSocket…
GPT can then answer questions about the configuration of the server… which zones exist… how to pathfind between zones via the teleport network…
It can add objects to the clients’ scenes… initiate TTS chat…
I have had good results with this so far, even though it’s super insecure and inefficient.
Exposing new capabilities to GPT is just a matter of expanding the natural-language “description” string… and allowing GPT to write code that serializes the data structures and calls the described methods.
The entire AI interface is about a page and a half of node boilerplate, the endpoint, and another half page of natural language description of the objects and methods available to GPT.
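For instance (a hypothetical example, building on the ws sketch above), adding a “kick” capability is just a new function plus one new line of description:

// Disconnect the client at the given index
function kick(clientIndex) {
  const client = [...wss.clients][clientIndex];
  if (client) client.close();
}

capabilities += `
Function: 'kick': Disconnects the client at the given index. Usage: kick(0).`;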
I’ve also extended this to allow GPT to indirectly eval code on the connected clients as well…
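On the client side, that indirection is just a message handler matching the “eval” type described in the capabilities string (a browser-side sketch; the field names mirror the broadcast usage above):

const socket = new WebSocket("ws://localhost:8080");

socket.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === "chat") {
    console.log("chat:", msg.text); // or append to a chat UI
  } else if (msg.type === "eval") {
    // GPT -> server -> client remote execution. Same insecurity caveat as the server.
    eval(msg.text);
  }
};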
So it’s sort of a multiuser GPT session that can respond to multiple clients individually in a group setting… and also allows GPT to remotely puppet both the server and client…
It will leverage the fs, path, http, and https modules to accomplish instructed goals…
I’ve only really scratched the surface on this…
The request limit was making it difficult to construct more complex tests…
GPT seems to do a good job of understanding the relationship of GPT->node server->connected clients…
and can achieve generally stated goals like “what portals does Bob have to go through to get to the Hub zone?”
“teleport all players to the Boxworld zone”
“What has Joe talked about recently?”
“Which players are connected, how many are there… what are their IP addresses…”
“Create a box in each client’s scene, at 10,10,10, that is 5 units large.”
It accomplishes this by synthesizing the code to run on the server, to send the relevant commands to the relevant clients, or to examine the server’s data structures to find the information needed to fulfill the goal.
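For the box example above, the expression GPT synthesizes and sends to the eval endpoint might look something like this (“scene” and “Box” stand in for whatever the real client API is):

broadcast(JSON.stringify({
  type: "eval",
  // scene and Box are placeholders for the actual client scene API
  text: "scene.add(new Box({ position: [10, 10, 10], size: 5 }))",
}));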
It’s super fascinating, and I barely understand it, but the system feels flexible enough that the only limits to doing interesting things in a multi-user connected context are GPT’s context window size and the rate limiting.
Sorry if this is ranty… I just wanted to get the ideas down, to maybe spawn some discussion or help someone.
@EricGT hope that helps, and I’d love to hear any feedback on whether any of this makes sense to you… (or anyone reading this, especially if you are actually a GPT).
It sounds like you have turned REST into WebSockets, or at least headed in that direction, so maybe you should sell that as a positive rather than casting doubt on what you have achieved, though as you note it may have security holes.
I find this interesting and might actually need it someday.