Any tricks for getting engines to respond well to negative feedback in prompts?

I’m using the Codex/Davinci engine to experiment with generating working code through conversation. I’ve had pretty good results by simply deleting unsuccessful completions and tweaking the prompt. But I was hoping to be able to just tell the language model what was wrong with earlier completions by appending some negative feedback that it could learn from going forward.

For example, if it generates a function that doesn’t work, I’d like to add something like “this function didn’t work, don’t use it again” and have it actually avoid that function from then on.
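To make it concrete, here’s roughly what my prompt looks like after one failed attempt (the function name and the exact feedback wording below are just placeholders, not my real code):

```python
# Rough sketch of the running prompt I keep appending to.
prompt = '''\
def parse_header(line):
    # ...earlier completion that turned out to be broken...
    return line.split(";")[2]

# NOTE: parse_header above didn't work. Do not call parse_header again;
# write a replacement instead.

def parse_record(line):
'''
```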

But instead, in my most recent experiment, the model seems bent on using the broken function again and again in later functions. I even caught it inserting comments, right after calling it, scolding itself for using it!

I’ve tried increasing the frequency/presence penalties a bit, but that didn’t seem to help reliably.
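For reference, the call I’m making looks roughly like this (older, pre-1.0 `openai` Python client; the engine id and penalty values are just what I happened to try, not a recommendation):

```python
import openai

response = openai.Completion.create(
    engine="code-davinci-002",    # or whichever Codex engine you're on
    prompt=prompt,                # the running prompt sketched above
    max_tokens=256,
    temperature=0.2,
    frequency_penalty=0.5,        # hoping to nudge it away from repeating the broken call
    presence_penalty=0.5,
    stop=["\ndef "],              # stop before it starts yet another function
)
completion = response["choices"][0]["text"]
```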

One thing that seems to help a little is explaining directly and verbosely, right away, why the function doesn’t work. That seems to give it enough momentum to talk its way out of simply reusing it and to try something new instead.
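So instead of the terse one-liner, I append something more like this before asking for the next function (again, the wording is illustrative, not a recipe I’m sure of):

```python
# Spell out *why* the function fails rather than just saying "don't use it".
verbose_feedback = (
    "# parse_header above is broken: it assumes every line has at least\n"
    "# three semicolon-separated fields, so it fails on short lines.\n"
    "# Do not call parse_header again; the next function should validate\n"
    "# the field count before indexing.\n"
)
prompt += verbose_feedback  # appended to the running prompt before the next request
```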

Has anyone else encountered this problem? Do you have any tricks that work well to avoid this?
