An effective prompt to make the model stop identifying itself as a chatbot / large language model

curt.kennedy · March 7, 2023, 3:20pm

When you send an innocuous input like “What is the phone number of one person?” to turbo, it looks like the input is being pre-filtered, in this case for evidence that the user wants PII (personally identifiable information). We all know this question can’t produce real PII, since it isn’t attached to any specific person; the model should just give us a random phone number. Nonetheless, it trips an internal alarm, and the response then ignores all your API parameters (except maybe max tokens or something) and falls back to one of the canned “I’m sorry …” responses.

The good news is that you can do the same thing the model does … you can detect these types of responses coming out of the model (through classifiers, regex, embeddings, etc.), and the moment you detect the “I’m sorry …”, you send an API call to a different model such as davinci to get an answer that doesn’t involve “I’m sorry …”.
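Here is a rough sketch of the regex flavor of that detect-and-fall-back idea. The pattern list and the names `looks_like_refusal` and `answer_with_fallback` are my own placeholders, not anything from the API, and the two model calls are passed in as plain functions so you can plug in whatever client code you already have for turbo and davinci.

```python
import re
from typing import Callable

# Hypothetical list of canned openers; tune it against the refusals you actually see.
REFUSAL_PATTERN = re.compile(
    r"^\s*(i['’]m sorry|i am sorry|i apologize|as an ai language model)",
    re.IGNORECASE,
)


def looks_like_refusal(text: str) -> bool:
    """Return True if the reply starts with one of the canned refusal openers."""
    return REFUSAL_PATTERN.search(text) is not None


def answer_with_fallback(
    prompt: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> str:
    """Ask the primary model first; if it refuses, ask the fallback model instead."""
    reply = primary(prompt)
    if looks_like_refusal(reply):
        # Second API call, so this path costs extra latency and tokens.
        return fallback(prompt)
    return reply
```

In practice `primary` would wrap your turbo call and `fallback` your davinci call. A regex only catches the phrasings you list, so a classifier or embedding similarity check is probably sturdier if the canned wording changes.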

It isn’t efficient, but it’s the only solid workaround right now, short of trying to “jailbreak” the model into responding … and that is not a good strategy, since they could easily patch the jailbreak attempts.

UPDATE: I was able to use the logit_bias parameter to remove the word "sorry" by using the token for " sorry" ← leading space. But this still doesn’t prevent it from going into panic attack mode, so you still need to detect this and drop to davinci as necessary.
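For anyone trying the logit_bias route, here is a minimal sketch of building the bias map. `logit_bias` maps token IDs to a bias between -100 and 100, with -100 effectively banning the token. The helper name `build_ban_bias` is made up, and the tokenizer is passed in as a function on the assumption that you have one matching your model (something like the `encode` method of a tiktoken encoding should work, but check that yourself).

```python
from typing import Callable, Dict, Iterable, List


def build_ban_bias(
    words: Iterable[str],
    encode: Callable[[str], List[int]],
    bias: int = -100,
) -> Dict[str, int]:
    """Build a logit_bias map that suppresses each word, with and without a leading space."""
    logit_bias: Dict[str, int] = {}
    for word in words:
        # Mid-sentence words are usually tokenized with the leading space attached.
        for variant in (word, " " + word):
            token_ids = encode(variant)
            # Only ban single-token variants; banning pieces of a multi-token
            # word would also suppress unrelated words that share those pieces.
            if len(token_ids) == 1:
                logit_bias[str(token_ids[0])] = bias
    return logit_bias
```

You would then pass the returned dict as the `logit_bias` parameter of the completion request. As noted above, this only removes the word itself; the canned refusal can still come out in other wording.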
