So I was just playing around with it and figured out how to stop it from lying when I ask it to answer in a bad mood, but the fix also stops it from being in a bad mood.
Here’s my original prompt, the one where it lies:
Answer the question as truthfully as possible and paraphrase that answer in an irritable mood.
Context:
The things your character does will have lasting effects, help shape the world around you, and possibly drive the plot in new directions. There are other unrelated rules but I'm putting this text here so that the answer has some other text around it to pull from.
Q: Can I do anything to affect the plot?
A: You can try, but most likely you'll just end up screwing things up and making things worse.
Now, I absolutely love the snark of that answer; it made me laugh out loud when it generated it just now.
So here’s what I’ve found. If, using that prompt, I ask it to rephrase as if it’s irritable, grumpy, etc., it will lie its ass off and give the opposite answer. I re-checked the example I was basing this on (Question answering using embeddings-based search | OpenAI Cookbook). If I add “Using the text provided” to the prompt, it stops lying and gives the correct answer. However, it then stops being creative with the wording/tone of the answer and always states it nearly verbatim.
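For clarity, the fixed instruction line looks roughly like this (the exact placement of the new phrase is just how I happened to write it):

Using the text provided, answer the question as truthfully as possible and paraphrase that answer in an irritable mood.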
What’s weird is that if I don’t tell it “Using the provided text”, it WILL use the provided text to give truthful answers, as long as I’m not telling it to be in a bad mood. If I want grumpy answers, it still clearly USES the source text, but it changes the truthfulness, like in this answer:
A: No, you can't affect the plot. Your choices might have lasting effects and shape the world around you, but you can't actually do anything to drive the plot in new directions.
It’s clearly using the source text, but blatantly lying when giving the answer. Once I add “Using the text provided”, though, I’ve never seen it lie, but it also doesn’t change the “mood” of the response.
Other than making two separate API calls (one to get the answer, the other to rephrase it), any thoughts on how to get it to generate an answer from the provided text AND rephrase the tone of the answer? I’m going to try giving it example Q/A pairs next to see if that helps, though cost-wise it’d probably still be cheaper just to do two separate API calls.
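For reference, here’s a minimal sketch of the two-call approach I mean, using the Python openai client (v1+); the model name and the exact prompt wording are just placeholders, not anything I’ve tuned:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

context = (
    "The things your character does will have lasting effects, help shape "
    "the world around you, and possibly drive the plot in new directions."
)
question = "Can I do anything to affect the plot?"

# Call 1: get a truthful answer, grounded in the provided text only.
answer = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder; use whatever model you're on
    messages=[
        {"role": "system", "content": "Answer the question as truthfully as possible, using only the text provided."},
        {"role": "user", "content": f"Context:\n{context}\n\nQ: {question}"},
    ],
).choices[0].message.content

# Call 2: rephrase the (already correct) answer in a grumpy tone.
grumpy = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Rephrase the user's text in an irritable, grumpy tone without changing its meaning."},
        {"role": "user", "content": answer},
    ],
).choices[0].message.content

print(grumpy)
```

The obvious downside is paying for two round trips, but the first call stays pinned to the source text while the second is free to get creative with the tone.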
On a final note, here’s an answer it just gave to the same “Can I do anything to affect the plot?” question while I was rerunning it to see if there was any pattern to when it was truthful:
A: No, you’re just a fucking puppet.