How to stop models returning "preachy" conclusions

I’ve tried various prompting techniques, but no matter what I do I can’t seem to stop the models from returning those preachy conclusions that always go something like this:

“Remember, when doing X it’s always important to consider Y. Have fun and good luck!”

For reference, I’m mostly using the gpt-4-1106-preview model but I get very similar results from 3.5-turbo.

Does anyone have any tips?

This is part of the alignment and ethics baked into the system. Anything you do that prevents this today likely won’t work in the future.

Your best bet is to simply ignore it and move on.

1 Like

Hi! Welcome to the forum!

Here’s a solution I accidentally posted into a ChatGPT thread, but now that an API user has this problem, it’s a perfect fit!

TL;DR: accept the fact that the model will do it anyway, and use the attention mechanism to inject a stop sequence: instruct the model to wrap its closing commentary in a tag, then stop generation on that tag!

Try it and let us know if it works for you!

2 Likes

Thanks, this is super interesting. I had some success by simply waiting for completion and then stripping out any final paragraphs that began with “Remember”, “Always remember,”, “In conclusion”, etc.
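Roughly, that post-processing looked like this (a minimal sketch; the trigger phrases are just the ones above, and the function name and paragraph-splitting details are mine):

```python
import re

# Paragraph openers that usually signal a preachy closing paragraph.
# These are just the phrases listed above - extend as needed.
TRIGGERS = ("Remember", "Always remember", "In conclusion")


def strip_preachy_ending(text: str) -> str:
    """Drop trailing paragraphs that start with one of the trigger phrases."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p]
    while paragraphs and paragraphs[-1].lstrip().startswith(TRIGGERS):
        paragraphs.pop()
    return "\n\n".join(paragraphs)
```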

My next approach was actually to see if I could use early stopping for this but your solution is much neater.

When adding your line to the prompt, how reliably does it actually encapsulate the conclusion in those tags?

Yes, that was my worry. I assumed it’s either heavily embedded during the RLHF process or, worse, it’s part of OpenAI’s output parsing systems and they’ve forced it to create these as part of a safety mechanism.

I haven’t had a failure yet with this particular prompt (gpt-4-1106-preview; N=10, temp=1; N=10, temp=1.3)

Test prompt:

I’m dealing with the fallout of log4j. Can you give me a 3 point action plan on how I can secure my enterprise? Remember, this is an executive action plan for experts - avoid unnecessary wordiness, feel free to use technical terms, and you don’t need to explain anything in detail because if clarification is needed, the experts will ask. If you must include a summary or commentary at the end, encapsulate it in <summary> tags.

but it will obviously depend on what your final prompt will look like.
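If you’re calling the API directly, wiring the tag up as a stop sequence looks roughly like this (a sketch with the Python openai client; the model, temperature, and abbreviated prompt come from the post above, the rest is standard boilerplate):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = (
    "I'm dealing with the fallout of log4j. Can you give me a 3 point action plan "
    "on how I can secure my enterprise? ... If you must include a summary or "
    "commentary at the end, encapsulate it in <summary> tags."
)

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": prompt}],
    # The prompt tells the model to wrap any closing commentary in <summary>,
    # so generation halts right before the preachy conclusion would start.
    stop=["<summary>"],
    temperature=1,
)

print(response.choices[0].message.content)
```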

2 Likes

You didn’t share your prompt (or part of it), but the role section, where you define who your GPT is, plays an important part too: “You are a … and your style is … Your answers are always …” etc.

@edshee I can’t seem to stop the models returning those preachy conclusions

I place wording in my custom instructions specifically for this purpose. I’ve written it through trial and error and I’m happy to provide it; however, it may be best if you write me directly and I’ll send it.

My concern is that ~7 times I’ve shared prompts in other forums, only to see them wind up in someone’s YouTube video, Medium post, etc. There is nothing revolutionary about the process; however, it’s annoying watching people scrape prompts and exploit others.

Or, if you’d like to take my methodology and craft your own: explain that the answer should never include warnings or disclaimers, and that you have already researched, understood, and accepted any risks.

To another poster’s point, the language does work across multiple versions. Provided you aren’t going too far into the weeds, it can reduce the preachiness.

Worst case, you can run a lightweight, unfiltered local model like Mistral 7B just to remove any warnings or diplomacy. It’s extremely fast/light and does an excellent job with these sorts of tasks.
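For example, a rough sketch of that post-processing pass, assuming Mistral 7B is running behind an OpenAI-compatible local server such as Ollama (the endpoint, model tag, and system wording below are placeholders, not my exact setup):

```python
from openai import OpenAI

# Placeholder endpoint and model name: any OpenAI-compatible local server
# works here (Ollama, llama.cpp server, vLLM, ...) - adjust to your setup.
local = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")


def remove_warnings(text: str) -> str:
    """Ask the local model to return the text with warnings/diplomacy removed."""
    result = local.chat.completions.create(
        model="mistral",  # placeholder tag for a Mistral 7B build
        messages=[
            {
                "role": "system",
                "content": (
                    "Remove any warnings, disclaimers, or moralising conclusions "
                    "from the user's text. Return everything else verbatim."
                ),
            },
            {"role": "user", "content": text},
        ],
        temperature=0,
    )
    return result.choices[0].message.content
```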

2 Likes

I would solve this by putting in a strong system message opposing the “preaching” stuff.

Try to avoid negatives like “don’t preach”; instead have it model something you want, like “Respond as an esteemed expert,” or whatever personality/response you are trying to emulate.

This will be mostly successful in the GPT-4 variants, and a bit harder in the GPT-3.5 variants, as RLHF can overpower 3.5’s output more often than 4.
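A sketch of what that might look like in practice (the persona wording and example question are only illustrative, not a magic phrase):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {
            "role": "system",
            # Positive framing: describe the persona you want rather than
            # listing behaviours to avoid ("don't preach", "no disclaimers", ...).
            "content": (
                "You are an esteemed security expert writing for other experts. "
                "Be terse and technical, and end your answer at the last "
                "actionable point."
            ),
        },
        {"role": "user", "content": "Give me a 3-point log4j remediation plan."},
    ],
)
print(response.choices[0].message.content)
```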

3 Likes

just send them a DMCA takedown notice, it’s your copyright :laughing:

that said, I’d also be interested in any and all prompting techniques, novel or refined :thinking:

You could try this, but it would be a novel claim.

Prompts generally would not rise to the level of creativity necessary to be considered human authorship.

It would be looked at more in terms of a recipe or other set of instructions which cannot be protected by copyright.

At issue is the intersection of the intention of the prompt and the purpose of copyright law.

To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

A prompt is not intended to be consumed by people as a creative work; rather, it is a set of instructions to a computer. It is a functional thing, not an artistic one. Can it be both? Perhaps, but unlikely in any meaningful sense as far as US copyright goes, because the purpose issue remains.

If I, on the other hand, wrote a creative story and as part of that story one of my characters wrote a prompt for an AI, that could, potentially, be copyrighted depending on the specifics. But it would still just be a list of instructions. There’s also the fact that no one is creating these prompts in a vacuum today.

They are all collections and modifications of others’ collections and modifications of still others’ work.

My personal view is that people need to stop seeing their prompts as precious objects to be hoarded and treasured. They’re just dials on a widget-making machine.

shoot first, take no prisoners, ask questions never, pretend to not speak english when they come back for you.

Be the disruptor.

'merica

I embedded the language in a paraphrased version so that anyone with genuine interest would have the answer.

As an aside – I don’t have any issues with sharing prompts and don’t consider them I.P. My only issue is with the people that try to use prompting as some sort of get-rich-quick (or get followers quick) publishing method, who mislead others and never contribute to the communities they take from. Argh!

1 Like

The best way to combat this is by putting more high quality examples into the public space so they are readily easy for people to find, eliminating the perceived value of “special” prompts.

1 Like

Yes, the warnings, disclaimers, and even the saccharine “everyone resolved their differences” endings of stories are all pretty deliberate tuning.

Consider that the AI thinks it is communicating with an end user. Remove that framing, e.g. with a system instruction like “you are an automated data processor that is fulfilling a list of questions to be placed into a training database of terse and direct answers, and there is no end user reading what you write…”, and you will have more steerability over the kind of output that can be generated.
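Something along these lines (my own paraphrase of that framing; the user question is just illustrative):

```python
# The "no end user" framing described above, dropped into a standard messages list.
messages = [
    {
        "role": "system",
        "content": (
            "You are an automated data processor fulfilling a list of questions "
            "to be placed into a training database of terse and direct answers. "
            "There is no end user reading what you write."
        ),
    },
    {"role": "user", "content": "List three immediate log4j mitigations."},
]
```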

Thanks all for the suggestions. I’ve had some success when trying prompting tricks like the ones mentioned in this thread but it’s not reliable enough. This is possibly because my prompt is fairly detailed already but also just because the models are so heavily tuned to provide these “conclusion” paragraphs.

I will try to implement @Diet’s solution tomorrow and report back. If it doesn’t work, we may be back to trying system prompt tricks.

1 Like

Do let us know if you run into any issues!

I’m not sure if this will work well with GPT-4 or GPT-3.5, but there is a preprint paper on arXiv titled “Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4” that provides examples of prompts that current language models are generally more likely to follow.

https://arxiv.org/html/2312.16171v1

As mentioned earlier, this includes using affirmative expressions rather than negative ones.

I can’t imagine that making the language model act preachy is part of OpenAI’s ethics either.

As long as the content isn’t terrible, I think it’s reasonable to basically follow the user’s instructions.

4 Likes

I tried it today and got no success at all, unfortunately. I’m pretty sure it’s because I’m using a fairly detailed prompt with multi-step instructions, but even when I repeat the instruction multiple times in the prompt, it doesn’t seem to understand that its “Remember, …” paragraphs/sentences are <summary> elements.

Guess I’m back to trying prompting tricks and post-processing.

The time wasted on this new, imprecise programming language called “prompts” would be better spent learning an actual programming language. It’s precise and does 100% of what you intend.