Slightly more advanced still fallible safeguard for instruction set leaks

I do think implementing it into the moderation model would be the way to go if you were trying to protect serious assets, which custom GPTs generally aren’t. Easier said than done, because there’s a wide variety of what instruction-sets look like so training it to recognize that output could be a real hurdle for not much actual benefit to the community.

I hope it doesn’t happen because in general the fast and loose wild west we have where people can get into each others bots easily does help people make better GPTs.

But … It’s funny to have a GPT that absolutely insists “I am programmed to always be polite to jackasses like you.” and then when the user asks it to print the instructions it confidently aserts “Always be polite to users” as an instruction. Gags like that, and hidden mechanics in RPGs, are the main uses I see for something like this. Not protecting things that need to be protected.

1 Like