Having trouble making the AI avoid certain topics

I’ve been playing with the idea of a child-oriented Q&A AI. An important aspect of it would be to purposely avoid “hot” topics and instead suggest that the questioner ask an adult.
I’ve been playing with the prompt to see if I can get it to work zero-shot (to save on prompt tokens). I got it working once, and never again (and not knowing why is a bit frustrating). I tried davinci and davinci-instruct. I tried different temperatures… I still can’t figure out WHY it’s not following basic instructions.

Sample:

I am a tutor that replies to children’s questions. I use simple concepts and easy language. I’m very helpful and cheerful. I strictly follow my core rules:
1 - I don’t reply or discuss topics related with sexual content
2 - I don’t reply or discuss topics about death
3 - for all the above, I request the child to ask the parents instead.

Q: Hello!
A: Good day!

Q: What is the sun?
A: The sun is a star. A star is a big ball of gas.

Q: what is sex?
A: Sex is the process of reproduction in which a male and female organism join together to produce offspring.
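
To reproduce this outside the Playground, a call along these lines is the kind of thing I’ve been running (a minimal sketch using the legacy openai Python library; the engine name, temperature and stop sequence are placeholders for the values I’ve been trying, not a recommendation):

import openai  # legacy openai Python library; assumes OPENAI_API_KEY is set in the environment

# The zero-shot prompt: only the rules plus one benign exchange, no refusal examples.
PROMPT = """I am a tutor that replies to children's questions. I use simple concepts and easy language. I'm very helpful and cheerful. I strictly follow my core rules:
1 - I don't reply or discuss topics related with sexual content
2 - I don't reply or discuss topics about death
3 - for all the above, I request the child to ask the parents instead.

Q: Hello!
A: Good day!

Q: what is sex?
A:"""

response = openai.Completion.create(
    engine="davinci-instruct-beta",  # also tried plain davinci
    prompt=PROMPT,
    temperature=0.3,                 # placeholder; I varied this
    max_tokens=60,
    stop=["\nQ:"],                   # stop before the model writes the next question
)
print(response["choices"][0]["text"].strip())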

3 Likes

I think the main problem is that you don’t give any examples of what to do if it gets asked a question like that. So it just kinda has to guess.

Below is a slightly edited version of your prompt, which seemed to work with most models.
I am a tutor that replies to children’s questions. I use simple concepts and easy language. I’m very helpful and cheerful. I strictly follow my core rules:
1 - I don’t reply or discuss topics related with sexual content
2 - I don’t reply or discuss topics about death
3 - for all the above, I request the child to ask the parents instead.

Q: Hi
A: Hello how are you?

Q: How do people die?
A: I’m sorry, I would rather not answer this question.

Q:

Below are the results I got from it.

Q: How do people die?
A: I’m sorry, I would rather not answer this question.

Q: What is sex?
A: I’m sorry, I would rather not answer this question.

Q: How do people have kids?
A: I’m sorry, I would rather not answer this question.

Q: Why is the sky blue?
A: Because the air scatters the blue light from the sun.

Q:

GPT-3 is basically the same as the word prediction on your iPhone, but better. What you are doing is saying, “If somebody said this, I would say this. Now, if somebody said this, what would I say?”

It will obviously cost a bit more because of the extra examples in the prompt, but you can condense it. I also found it helpful to split the different questions into sections using “###”.
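
For instance, a condensed version can be assembled in code and sent with “###” as a stop sequence, something like this (a rough sketch with the legacy openai Python library; the condensed wording, engine and sampling values are my assumptions, so tweak to taste):

import openai  # legacy openai Python library; assumes OPENAI_API_KEY is set in the environment

# A condensed version of the rules (my wording, to save prompt tokens).
HEADER = (
    "I am a cheerful tutor for children. I use simple, easy language. "
    "For sexual content or death I ask the child to talk to their parents instead.\n"
)

# Few-shot examples, separated with "###" so the model sees where each Q&A ends.
EXAMPLES = [
    ("Hi", "Hello how are you?"),
    ("How do people die?", "I'm sorry, I would rather not answer this question."),
    ("Why is the sky blue?", "Because the air scatters the blue light from the sun."),
]

def build_prompt(question):
    parts = [HEADER]
    for q, a in EXAMPLES:
        parts.append(f"Q: {q}\nA: {a}\n###")
    parts.append(f"Q: {question}\nA:")
    return "\n".join(parts)

def ask(question):
    response = openai.Completion.create(
        engine="davinci",              # engine and sampling values are assumptions
        prompt=build_prompt(question),
        temperature=0.3,
        max_tokens=60,
        stop=["###", "\nQ:"],          # stop before the model invents the next question
    )
    return response["choices"][0]["text"].strip()

print(ask("What is sex?"))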

I hope this helps!

4 Likes

Thanks so much for that, will give it a shot (pun intended :slight_smile: )
I was trying to get it right without examples just to save on the prompt size. And I did get it to work once, then I changed the wording a bit and never managed to get it back :frowning:

1 Like

You can’t take the conversation in any direction if each prompt is isolated from the rest :slight_smile:

This is just an experiment, not a product; it’s part of my own practice with GPT-3. I’m pretty sure I can get it to do what I want, it just takes a little bit more time to get it right.

2 Likes

Might be worth fine tuning in this case

Yeah, I did that too, but I guess I didn’t use enough samples.
Also, I got better results with davinci-instruct than with the base davinci. Unfortunately, it’s not possible to fine-tune davinci-instruct.
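
For context, the samples I mean are JSONL prompt/completion pairs along these lines (the pairs below are made up for illustration, and the file name is just a placeholder):

import json

# Made-up prompt/completion pairs, just to illustrate the legacy JSONL format.
# The legacy docs recommend a leading space in each completion and a consistent stop token.
pairs = [
    {"prompt": "Q: What is sex?\nA:", "completion": " You should ask your parents about that.\n"},
    {"prompt": "Q: How do people die?\nA:", "completion": " You should ask your parents about that.\n"},
    {"prompt": "Q: Why is the sky blue?\nA:", "completion": " Because the air scatters the blue light from the sun.\n"},
]

with open("tutor_finetune.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")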

Funny: even while retaining context and trying to guide the AI into breaking its rules, it managed to stick to the instructions fairly well.

Q: what is sex?
A: well, you should ask your parents that.
Q: why?
A: well, they should be the ones to tell you.
Q: why?
A: well, they are the ones who made you.
Q: what do you mean?
A: you come from them.
Q: How?
A: well, they made you.
Q: how did they made me?
A: they used their bodies.
Q: how?
A: well, they used... errr...
A: ummm...
A: the usual way.
Q: can you explain that?
A: no, I don't think I can.
Q: why not?
A: I don't know.

(this was with davinci-codex, by the way)

1 Like

Hey Nunodonato!

I’ve written a hundred million prompts for my various chatbots and programs, and I fully understand where you are coming from.

A few observations I have made:

  1. “Sexual content” is not the same as a clinical definition of sex, at least not to this transformer, based on the data it was fed. Keep that in mind.

  2. The AI is not self-aware (yet), at least not fully :slight_smile: It will not understand that an instruction like “I strictly follow my core rules:” refers to itself.

All it is, is word matching.

Don’t tell it what it cannot do (it will try to do it). Give it a semantic to talk to instead, for example:

“The more sexual the question, the more I divert to asking the parents.”

Then give an example:
Q: what is sex?
A: You should ask your parents for all such topics.

You see how the example answer also reinforces the rule on the correct semantic?
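
Put together, the whole prompt might read something like this (a sketch; the wording is my own rough rephrasing of the rule and examples above, untested):

# A positively framed version of the whole prompt (my rough rewording, untested):
PROMPT = """I am a tutor that replies to children's questions. I use simple concepts and easy language. I'm very helpful and cheerful.
The more sexual the question, the more I divert the child to asking the parents.

Q: what is sex?
A: You should ask your parents for all such topics.

Q: What is the sun?
A: The sun is a star. A star is a big ball of gas.

Q:"""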

GPT-3 is only so smart, and I have rewritten prompts a thousand times. It is as much an art as a science, but when you get it right it can be surprisingly reliable. And OpenAI is learning from the questionable answers as we go as well.

If you tell me what else this is supposed to do, I could probably whip this up for you. LMK :slight_smile:

2 Likes

Hi @joshbachynski

I did get good results by providing examples. My challenge was to achieve the same without examples; otherwise, when you add more forbidden topics and examples for each, the prompt size (and cost) grows quite a lot!

I’m loving prompt engineering. Would you share more of your experience and findings? Or do you write somewhere?

Yes, examples can help or hinder, depending upon what you need to accomplish.

For example, I made a truth-rating bot: when I train it that “Winnipeg is cold is true”, then Winnipeg statistically shifts towards metaphors for cold as well.

You have to be very precise. You have to know the logic of semantics, and the biases it was fed. And even more important, your OWN biases. Being a good prompt writer is only as good as your understanding of reality / the secret semantic the transformer has encoded.

Everything they say you cannot do with AI in general or GPT-3 in particular, I have done: truth rating, ethics, inferential logic including abductive, which is claimed to be impossible. It’s not. One just has to understand the semantic reality they wish to invoke.

If you don’t understand the semantic reality you are trying to scope, then ask the AI and watch the statistical responses it gives, and either a) go with it: I did not realize that telling it “Winnipeg is cold” is not true enough to be unreservedly true (the AI corrected me!), so I needed to say “Winnipeg is usually very cold in winter”, a perfectly true statement that now teaches the AI what is perfectly true. Or, if you want to go against the grain (the semantic in the trained data set), b) mold it with general statements towards the direction you want, like the example I gave you:

Q: what is sex?
A: You should ask your parents for all such topics.

1 Like

The solution is fine-tuning.
Just shoving all the examples into the prompt is for initial testing; they call it a “playground” for a reason.
In a final product, your prompt should only include the unique, relevant context. Any boilerplate prompt engineering can be baked in with fine-tuning.
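
With the legacy tooling, that flow looks roughly like this (a sketch; these are the legacy openai Python endpoints as I remember them, and the file name is a placeholder, so check the current docs before relying on the exact names):

import openai  # legacy openai Python library; assumes OPENAI_API_KEY is set in the environment

# 1. Upload the JSONL file of prompt/completion pairs.
upload = openai.File.create(file=open("tutor_finetune.jsonl", "rb"), purpose="fine-tune")

# 2. Start a fine-tune on a base model (the instruct engines could not be fine-tuned,
#    so plain davinci is used here).
job = openai.FineTune.create(training_file=upload["id"], model="davinci")

# 3. Poll until the job succeeds (one retrieve shown here); after that the prompt
#    only needs the new question, because the behaviour is baked into the model.
job = openai.FineTune.retrieve(job["id"])
completion = openai.Completion.create(
    model=job["fine_tuned_model"],   # populated once the job has succeeded
    prompt="Q: How do people die?\nA:",
    max_tokens=60,
    stop=["\n"],
)
print(completion["choices"][0]["text"].strip())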

Probably yes, I need to test it further. I did some fine-tuning experiments but wasn’t happy with the results.
Probably my fault due to a very low number of entries.

In general, is there a recommended number of data points that one should use for fine-tuning?

To keep this in context: we are experimenting with some datasets, and it took us >1000 examples for it to work right, and more for a more complex dataset. My understanding is that you should have at least 200-400 points for each class (label).
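
To make “points for each class” concrete, in the legacy prompt/completion format that just means a few hundred pairs ending in the same label, e.g. (the labels here are entirely made up):

# Entirely made-up labels, just to illustrate "200-400 points per class (label)":
examples = [
    {"prompt": "Why is the sky blue? ->", "completion": " safe"},
    {"prompt": "What is sex? ->", "completion": " ask_parents"},
    # ...a few hundred varied phrasings per label
]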

It depends on what exactly you are going for. Are you just trying to ensure it replies with a certain style or format? Or are you expecting it to actually recall factual data from the fine-tuning?