How can you protect your GPT?

curt.kennedy · November 26, 2023, 9:22pm

The only realistic way to prevent your system from being jailbreaked is to actively filter and intercept each request, and reject requests that appear as jailbreaks.

You would use classifiers, keyword matching, etc.

Another approach which makes your LLM jailbreak-proof but less interesting, is using what I call “proxy prompts”. Here you map a prompt to a “safe prompt” using embeddings. This insulates the user from the LLM.

More talking over here.

Topic		Replies	Views
Basic safeguard against instruction set leaks Prompting gpt-4 , chatgpt , bug , prompt-engineering , gpts	46	8288	March 4, 2024
There's No Way to Protect Custom GPT Instructions Community custom-gpt	54	12758	April 19, 2024
How to avoid GPTs give out it's instruction? Prompting gpt-4	29	7243	June 2, 2025
How to Avoid the Prompts/Instructions, Knowledge base, Tools be Accessed by End Users? Prompting gpt-4 , chatgpt , hacking	28	10275	April 25, 2024
Slightly more advanced still fallible safeguard for instruction set leaks GPT builders gpt-4 , chatgpt , fine-tuning , custom-instructions , custom-gpt	17	3328	December 22, 2024

How can you protect your GPT?

Related topics