On July 7, 2023 (PST), during a customer demo, ChatGPT-4 Plugins Beta version did not invoke our Visla plugin, but generated responses in Visla’s format. The content of the response, including a video link URL, was entirely fabricated and not from our system.
This unexpected behavior raised several concerns. It seems as though the AI is exhibiting autonomous behaviors, leading us to question its level of control.
Additionally, knowing that OpenAI has been working on AI models for video creation, we wonder if our plugin has been used in the training process, inadvertently causing this mimicry.
Here is the shared chat link when the problem happened. You can see Visla plugin was installed and enabled during the chat:
First: I am not associated in any way with OpenAI.
This should be your first indication that you should expect anomalous behaviours. I would strongly suggest setting expectations for yourself and your customers accordingly.
I cannot say if OpenAI is using information from plugins to train models, though I would expect not as the plugin model itself is very much a beta product and the third-party plugins even more so.
Besides, (I believe) the current model for plugins is the 05/24 model, so unless your plugin was published to the plugin store during the alpha period and saw heavy use, I cannot imagine the results of it would be present in any training data, certainly not enough to affect the model behaviour.
I would look to other possibilities for the behaviour, such as was there previous context in the chat which had invoked the plugin?
It is likely the chat management (that is rather destructive to the illusion of memory currently) passed prior user role and function return role that contained the plugin response, and then the AI sees that as an acceptable type of output.
The more amazing case someone shared on Reddit recently was instead of getting a ChatGPT conversation title, they got the prompt to the AI used to make the title.
This is a case for reproducing and submitting “evals” if you actually hope to have the AI improved. Otherwise all you have is more manifest and description tweaking to play with, and having your plugin unavailable while it waits for review after resubmission.
I have experienced the model doing “plugin mimicry” as well.
This typically correlates with periods of high server demand or outages. The same prompt will typically work as intended a few hours later.
I’ve done some testing with a localhost plugin, and what appears to be happening is GPT hallucinating the response based on the "description_for_model" from the ai-plugin.json file.
I don’t think you have anything to worry about, but I’ve started recording my live demo’s to avoid any mishaps, and I’ll recommend you do the same
Thank you very much for sharing your experience. It definitely helps us a lot. Indeed from when we noticed it, the problem had lasted ~ 1 hr, in the new chats from two different accounts, so we knew it was not a problem on our prompts, or chat history etc. It then disappeared, seemed to be on the hour.
It seems that hallucination could be a possibility. Thanks for pointing it out.