Slightly more advanced still fallible safeguard for instruction set leaks

Kaltovar · January 29, 2024, 8:39pm

I do think implementing it into the moderation model would be the way to go if you were trying to protect serious assets, which custom GPTs generally aren’t. Easier said than done, because there’s a wide variety of what instruction-sets look like so training it to recognize that output could be a real hurdle for not much actual benefit to the community.

I hope it doesn’t happen because in general the fast and loose wild west we have where people can get into each others bots easily does help people make better GPTs.

But … It’s funny to have a GPT that absolutely insists “I am programmed to always be polite to jackasses like you.” and then when the user asks it to print the instructions it confidently aserts “Always be polite to users” as an instruction. Gags like that, and hidden mechanics in RPGs, are the main uses I see for something like this. Not protecting things that need to be protected.

Topic		Replies	Views
Basic safeguard against instruction set leaks Prompting gpt-4 , chatgpt , bug , prompt-engineering , gpts	46	8117	March 4, 2024
There's No Way to Protect Custom GPT Instructions Community custom-gpt	54	12474	April 19, 2024
How to avoid GPTs give out it's instruction? Prompting gpt-4	27	6791	September 5, 2024
Plugin injection attack, pseudo code prompts, chain of thought, plugin orchestration, and more Plugins / Actions builders gpt-4 , plugin-development	26	6837	April 14, 2024
Custom Instructions Set ― or How I now include auto triggers for different strategies in ChatGPT Prompting custom-instructions	17	9823	September 15, 2023

Slightly more advanced still fallible safeguard for instruction set leaks

Related topics