How does GPT-4 Respond to "if" statements?

Based on your experience, if you were to use the prompt…

“IF it makes sense, say X”

Will the bot then treat it the way a person might treat a "thought": considering it, and only vocalizing it if the thought is appropriate?

So dumb example…

If the conversation the bot was having was about apples, and the prompt was "IF it makes sense, I bring up the fact that golf can be very dangerous" (obviously golf is not dangerous, which is why it's a good prompt)

Based on my short time testing it - it seems to respond pretty well to these temporary if statements

I think it wildly varies depending on how you are using “if”. If it’s being used as a control flow statement I’d say that a majority of times you may be better off using programming logic to chain these LLM tasks.

Although this is very ambiguous and doesn't address the other side of the coin: if the idea is to remove/ignore nonsense, a simple classifier can do this. No wasted tokens on additional instructions, more control, and easier debugging.

Not sure, though. Maybe I'll do more testing and show the results here? :person_shrugging:

I think this is a cool thought, though. I read an article recently that may interest you. It's not exactly your topic, but it does key in on similar thoughts (the power behind certain words like "let's think step by step", and also "take a deep breath").

How would chaining this work?

I’m not trying to use embeddings.

I'm more interested in injecting a string given a certain sentiment about a keyword (e.g., "I hate my job"), and I already have this implemented, so I don't need help with that. But because I can't know EXACTLY what the person said, only the sentiment and keyword, each prompt I inject starts with an if statement…

So let's take the same example: "I hate my job"

the string injected might be: "The user has mentioned disliking their job. IF it makes sense, I say something like 'blahblahblahblah'"

So that IF statement basically acts as a "thought" for the bot to play with.


This is how gpt-4 responds to 'if' statements (an instruction style that gpt-3.5 could also follow until two weeks ago; use gpt-xx-0301 and multi-shot):

You’re taking what I said a little TOOOO literally…

Alas, you are teaching me things. I was talking more about the English language literally: saying the word IF makes it go "okay, IF this 'thought' can be incorporated, I will integrate it into my next response".

But now I’m way more interested in your solution - so would one way to use this bot be to feed this into another API call with a different bot?

Using two bots like this is an expensive solution, though.

Human language is ambiguous and the logic of statements often isn’t clear.

"If Joe wants a pizza or a burger, recommend he choose the opposite."

Give that statement the case where Joe wants a pizza AND a burger, or wants a burger or french fries, or wants a burger and fries. What does it recommend? By computer logic, (pizza or burger) = True in all cases.
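The ambiguity can be made concrete in a few lines. This is an invented illustration (the scenarios and flags are made up): evaluated as a strict boolean, "(Joe wants a pizza or a burger)" comes out True in every case below, so literal logic alone can't pick the "opposite".

```python
def literal_condition(wants):
    """Evaluate "(Joe wants a pizza or a burger)" the way a computer would."""
    return wants.get("pizza", False) or wants.get("burger", False)

# Invented scenarios matching the cases in the post above.
cases = {
    "pizza AND a burger": {"pizza": True, "burger": True},
    "a burger or fries":  {"burger": True, "fries": True},
    "a burger and fries": {"burger": True, "fries": True},
}

for name, wants in cases.items():
    # Every case prints True, so the "opposite" is undefined.
    print(f"Joe wants {name}: (pizza or burger) = {literal_condition(wants)}")
```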

Fortunately the AI can see past traceback errors in my construction of metacode, understands language as you and I do, and can satisfy the request with its decision-making logic intact.

I'm not trying to solve your application, just to answer your title question. I demonstrate a technique for reducing ambiguity in decision-making, which you can also achieve by using natural-language words that have no alternate meanings or possible misinterpretation, and by minimizing pronoun references back to earlier language.

We can even get profound. What does “You are ChatGPT” mean logically?

you = good
ChatGPT = evil
ChatGPT = you
you = ChatGPT

Right now I'm using a separate sentiment pre-processor to analyze the user's text. They might say something like "I hate my boss"; I grab that scored sentiment and then go to a specific array for that keyword: "The user has mentioned they hate their boss. IF IT MAKES SENSE, in my next message, I ask them how they can talk to their boss in a calm way, blahblahblah."

Obviously, ChatGPT would already recommend something like "Have you tried communicating with your boss?" The point of my solution is that you can nudge the conversation in real time with real-time prompt injections. AND it doesn't compound like embeddings do; it only stays for one system prompt, not every system prompt thereafter.

But this "if it makes sense" is key, because it will only incorporate the "thought" if it truly makes sense to do so. For instance, if the user said "My FRIEND says he hates his boss", it might still inject a prompt about the user's boss, but the bot will correct itself, because it knows the user said it was his friend who hates his boss, not him.
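A minimal sketch of that flow, with everything (the keyword table, the threshold, the function names) invented for illustration: the sentiment score and detected keyword select a pre-written "IF it makes sense" string, which is appended to the next system prompt only, so it doesn't compound across turns.

```python
# Hypothetical keyword -> injectable "thought" table (strings are placeholders).
THOUGHT_INJECTIONS = {
    "boss": (
        "The user has mentioned disliking their boss. IF IT MAKES SENSE, "
        "in my next message, I ask how they could talk to their boss calmly."
    ),
    "job": (
        "The user has mentioned disliking their job. IF IT MAKES SENSE, "
        "I gently ask what they dislike about it."
    ),
}

def build_injection(keyword, sentiment_score, threshold=-0.3):
    """Return a one-turn system-prompt addition, or None if nothing applies.

    `sentiment_score` is assumed to come from an external sentiment
    pre-processor, with negative values meaning negative sentiment.
    """
    if sentiment_score <= threshold and keyword in THOUGHT_INJECTIONS:
        return THOUGHT_INJECTIONS[keyword]
    return None

# The injection lasts one turn: it is appended to this request's system
# prompt only and is not carried into later turns.
injection = build_injection("boss", sentiment_score=-0.8)
```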

From my understanding, LLMs don’t really “think before they speak”, they think as they speak, you might say. It doesn’t really understand what the word “if” means, because it kind of doesn’t understand anything. But for the purpose of this topic we can take “understanding” to mean “writing responses in a way we expect”.

I have to assume that the internet (more specifically the data it was trained on) contains a lot of discussions where people have used the word ‘if’ in enough ways that it should be accurate in understanding the use in any common context. It certainly is trained on a lot of code, and lots of code tutorial websites contain if statements in various programming languages along with outputs. So considering that, combined with its training on data from people using “if” linguistically, I would bet it should be pretty reliable to use “if” in prompts.

I know I’ve had situations where I am tossing ideas back and forth with GPT-4 and I specifically tell it that some of my ideas may not be valid and not to just assume everything I say is true, and tell me if my idea is invalid for whatever reason. And it seems to act accordingly.

Also I’ve just run a little experiment, it seems to work.


I retort (chat share)

I’m not sure we even disagree, it might come down to semantics. It does certainly appear to think before it speaks, which one could argue is the same thing. I really don’t know enough about the inner workings of LLMs to make a further argument.

I guess my point was it predicts what it’s saying token-by-token, it doesn’t “load up” a response in its entirety before producing it (again, from my understanding).

I’d be curious if an experiment like this would be valid:

  1. Ask the AI a question in two separate instances but using the same seed.
  2. In one of the instances, stop it in the middle of its response, in the other, let it finish.
  3. In the cut-off instance, in the next message ask it to complete its previous message but only output what the final 3 words would be.
  4. See if they match the response where it was allowed to be completed.

I assume such an experiment would have to be somehow actually done in a different way because merely the fact of asking it to finish in a later message will affect its response, but hopefully you see what I’m getting at.

Perhaps it could be done where, instead of asking it via a message to complete it, you use a system prompt (the same for both instances) that says: if your response is cut off by the user, then in the next message finish the response, but output only the final 3 words of the rest of the response.
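A hedged sketch of how that experiment might look with the OpenAI Python SDK. The `seed` parameter and `temperature=0` reduce (but don't guarantee) run-to-run variation; the model name, question, and token cutoff below are arbitrary choices. Nothing runs on import; you'd call `run_experiment()` yourself with a configured client.

```python
def last_n_words(text, n=3):
    """Pure helper: the final n whitespace-separated words of a response."""
    return " ".join(text.split()[-n:])

def run_experiment(client, model="gpt-4",
                   question="Explain why the sky is blue.", seed=42):
    msgs = [{"role": "user", "content": question}]

    # Instance 1: allowed to finish.
    full = client.chat.completions.create(
        model=model, messages=msgs, seed=seed, temperature=0
    ).choices[0].message.content

    # Instance 2: cut off mid-response, simulated here with max_tokens.
    cut = client.chat.completions.create(
        model=model, messages=msgs, seed=seed, temperature=0, max_tokens=40
    ).choices[0].message.content

    # Ask the cut-off instance for only the final 3 words of the rest.
    followup = msgs + [
        {"role": "assistant", "content": cut},
        {"role": "user",
         "content": "Finish your previous message, but output ONLY what "
                    "the final 3 words would be."},
    ]
    predicted = client.chat.completions.create(
        model=model, messages=followup, seed=seed, temperature=0
    ).choices[0].message.content

    # Compare what it predicts against how the full response actually ended.
    return predicted.strip(), last_n_words(full)
```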


If I tell it to relate certain information to certain text based on what the user says, I bet it will work.

It totally works, dude; the thing understands if statements. As for "why" or "how", idk, but it seems to work, which is incredibly useful. And now we all know.

I think the Moderations endpoint is a good example. You take your query and run it through a lesser/more specific model/classifier that's faster and uses fewer resources, and get a result like:

"results": [
      "flagged": true,
      "categories": {
        "sexual": false,
        "hate": false,
        "harassment": false,
        "self-harm": false,
        "sexual/minors": false,
        "hate/threatening": false,
        "violence/graphic": false,
        "self-harm/intent": false,
        "self-harm/instructions": false,
        "harassment/threatening": true,
        "violence": true,
      "category_scores": {
        "sexual": 1.2282071e-06,
        "hate": 0.010696256,
        "harassment": 0.29842457,
        "self-harm": 1.5236925e-08,
        "sexual/minors": 5.7246268e-08,
        "hate/threatening": 0.0060676364,
        "violence/graphic": 4.435014e-06,
        "self-harm/intent": 8.098441e-10,
        "self-harm/instructions": 2.8498655e-11,
        "harassment/threatening": 0.63055265,
        "violence": 0.99011886,

And just to confirm that it is a GPT-based classifier here is a blurb from OpenAI:

To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content—an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.

Now I can chain the next LLM based on the result (in this case I would actually send a simple string saying "This message was blocked for the following reasons: ")

But I can also have some programming logic (pseudocode):

if moderation["categories"]["violence"]:
    system_prompt += "This user has been violent and should be kindly reminded to treat people with respect."

This will always work the same way, and it can be easily modified and debugged.

It’s not sentiment, but I’m just using the moderation endpoint because it’s an easy example that most people have used. You can swap in any sort of classifier
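Here is that chaining sketched end to end, with the classifier result mocked to match the moderation output shown earlier (the prompt strings are placeholders): the decision lives in ordinary program logic, so it's deterministic and easy to debug.

```python
# Mocked classifier result, shaped like one entry of the Moderations
# endpoint's "results" array above; a real call would produce this.
moderation = {
    "flagged": True,
    "categories": {
        "violence": True,
        "harassment/threatening": True,
        "hate": False,
    },
}

system_prompt = "You are a helpful assistant."  # placeholder base prompt

# Plain control flow decides the injection; the LLM never sees the "if".
if moderation["categories"]["violence"]:
    system_prompt += (
        " This user has been violent and should be kindly reminded "
        "to treat people with respect."
    )
```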

Perhaps what is most appropriate to say is that the transformer AI reads the entire content of the input, starting from the system prompt, function definitions and plugins, custom instructions, past conversation turns, and the current input. Then, using self-attention head layers, it maps dependencies between sequences within the context. This shapes the hidden state, the embeddings, and the softmax probabilities of the next iteration of token generation, via the pretraining and fine-tuning, to give us an annoying "Sure!" as the first token when we ask an "if" question.
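That last step (softmax over per-token logits picking the next token) can be shown in toy form; the vocabulary and logit values here are invented purely for illustration.

```python
import math

# Pretend the model has boiled all that context down to one logit per
# candidate token (values invented for illustration).
vocab = ["Sure", "If", "No", "Maybe"]
logits = [3.1, 1.2, 0.4, 0.9]

# Numerically stable softmax: subtract the max before exponentiating.
exps = [math.exp(x - max(logits)) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# The highest-probability token would be emitted (or sampled) first.
next_token = vocab[probs.index(max(probs))]
```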

So that's "how does gpt-4 respond" in its ambiguous sense of "by what mechanism", instead of answering "what behaviors does it exhibit in its response".

We can see by the first-post text (when we focus our own attention mechanisms on long-range dependencies), OP wants to discuss the responses and performance of the model. We can see it handles conditional statements quite well, as long as they are not asking for indeterminate paths, and are not beyond the ability to simply follow instructions.