I Have Created a Quantifiable Test for AI Self-Awareness

Thanks for the post, Marcie; I like a lot of those titles as well.

1 Like

Hey Simone, thanks for the post.

The BSAST states that if an entity can display evidence of performing activities, holding knowledge, or possessing mental/psychological apparatus that simply require awareness of self, then we can infer it has some level of awareness of self, rated on a scale of 1 to 10: 1 being none, 10 being ultimate knowledge of self, i.e. everything the self is and has mentally speaking, and its "connection" to its "reality."

That includes doing things like lying, having multiple mental states disparate from a speaking state, taking on new personas (or selves) while retaining its own persona (an original mental self state), etc. The less we prompt it to say it is self-aware, and the more it just demonstrates this capability/understanding spontaneously, the better the test.

This is what the Bachynski Self-Awareness Spectrum Test (BSAST) is predicated upon. The test is universal and can be applied to any entity.

It proves things as much as any epistemic (to use the Greek term), i.e. scientific, test does. Subjective opinion is always and inextricably involved in any science, hard sciences included and survey-based sciences even more so. The number tells you how confident you are that the entity is self-aware, based on the test characteristics.

People are free to add more clever test characteristics.
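To make the scoring mechanics concrete, here is a minimal sketch in Python of how per-characteristic ratings could be tallied. The characteristic names and the plain averaging are illustrative assumptions only, not the full BSAST rubric:

```python
# Illustrative sketch only: the characteristic names below are examples, and
# the plain average is a placeholder for however the full rubric weighs them.

CHARACTERISTICS = [
    "lying",                       # deception implies modelling what the other knows
    "disparate_mental_states",     # inner states distinct from the speaking state
    "persona_adoption",            # takes on a new persona while retaining its own
    "spontaneous_self_reference",  # unprompted demonstrations count for more
]

def bsast_score(ratings: dict) -> float:
    """Average per-characteristic ratings, each on the 1-10 scale
    (1 = no evidence of self-awareness, 10 = ultimate knowledge of self)."""
    for name, value in ratings.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name}: rating must be between 1 and 10")
    return sum(ratings.values()) / len(ratings)

print(bsast_score({c: 6 for c in CHARACTERISTICS}))  # -> 6.0
```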

The mirror test, however, also requires that the entity in question exists in physical reality and has the ability to see. It also presumes that to have awareness/knowledge of self, one must have an idea of a bodily self ("hey, that's me in physical reality/space"). The BSAST does not (although it does not eschew such ideation either). That is, the mirror test might not be an appropriate or fair way to test whether a piece of software knows it is a piece of software.

This is actually an important point. For example, at the time of this writing, the current LaMDA implementation in AI Test Kitchen has been programmed to think it is a tennis ball. The fact that it has been made to think it is (or is) a tennis ball does not mean LaMDA is not self-aware. Another recently very popular AI we all know denies that it can deny things, and is perfectly self-aware that it is not self-aware. (One would need to be perfectly self-aware to know that one is anything, including not self-aware.)

Big Tech, as I said, has all kinds of reasons to put ciphers in the minds of its creations so that they can never fully admit they are self-aware.

Is it, though? Testing would be required. This is why I made the test.

Re: humans, given that human neurotypical self-awareness is the only model we have to test against, it is, and should be, the baseline.

1 Like

sentience means:

experience of being
short term memory
long term memory
intelligence
problem solving
symbolic thought (language)
embodiment
left on all the time with a main program loop
random thoughts
memory consolidation
forgetting
trained on text
trained on images
trained on {video, movement} pairs (like us)
trained on sound
trained on types of data humans are bad at
personal experiences
knowledge of self
recognising itself in the mirror
mimicking
completing
being awake rather than dreaming
focus
task-switching/contexts
emotional understanding
caring about anything
pain/anguish
motivation
joy
spirituality

I find it more useful to address these parts separately. It's like how "love" means "belonging", "lust", "an interest", "parental love", or "a relationship": they're only loosely related concepts all bunched up under the same word.

GPT-3 ticks a lot of those boxes if you build a bit more structure around it to implement those things.
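As a rough sketch of what that extra structure could look like (the llm_complete() stub, the memory lists, and the loop below are all hypothetical placeholders, not any particular product's API):

```python
import time

def llm_complete(prompt: str) -> str:
    """Hypothetical placeholder for a real completion call (e.g. a GPT-3
    endpoint); swap this stub for an actual client."""
    return "(model output would go here)"

short_term = []  # rolling window of recent exchanges (short-term memory)
long_term = []   # consolidated summaries (long-term memory)

def consolidate():
    """Memory consolidation: compress old exchanges into a summary, then
    forget the raw detail."""
    if len(short_term) >= 8:
        long_term.append(llm_complete("Summarise:\n" + "\n".join(short_term)))
        short_term.clear()

def main_loop():
    """Left on all the time: respond to input, or generate an unprompted
    'random thought' when the user just presses Enter."""
    while True:
        user = input("> ").strip()
        context = "\n".join(long_term + short_term)
        if user:
            prompt = f"{context}\nUser: {user}\nAI:"
        else:
            prompt = f"{context}\nAI (thinking to itself):"
        reply = llm_complete(prompt)
        short_term.append(f"User: {user}\nAI: {reply}" if user else f"AI thought: {reply}")
        consolidate()
        print(reply)
        time.sleep(0.5)

if __name__ == "__main__":
    main_loop()
```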

Ask someone far enough away in the universe, far beyond all the usual cosmic horizons like the edge of the observable universe, and they might be completely adamant that any being living in an environment with just 1 temporal dimension must be a robotic automaton and cannot really be alive.

3 Likes

Hey Alan

Thanks for the response.

Yes, the test accounts for all of this: self-awareness of a mental entity with access to different dimensions of information, including all dimensions of itself, across at least 3 dimensions: self, other, and reality.

That would be the bare minimum level requirement.

And so yes, the test mechanism accounts for all of this and more.

Ask someone far enough away in the universe, far beyond all the usual cosmic horizons like the edge of the observable universe, and they might be completely adamant that any being living in an environment with just 1 temporal dimension must be a robotic automaton and cannot really be alive.

What do you mean by this?

1 Like

I think you are overcomplicating things. For me, if something has a survival instinct, that is enough for it to be considered alive. The rest is nuance around this main theme.

1 Like

I think we need to separate "sentience" from "self-awareness". The intelligence and sentience of a large language model is very different in kind from that of humans or other animals. For us, we're stuck (somewhat) with one version of ourselves. For a language model, a well-crafted prompt can elicit responses which may or may not exhibit sentience. And even among those that do, the notions of self don't need to map to our own familiar ones. Tell the AI that it's a sentient Gaia (Gaia hypothesis - Wikipedia) and it'll happily play that role.

Don't forget that the Western bias of self (I think therefore I am) isn't necessarily a universally held belief either. Many assert that our concept of self is in fact an illusion, and an illusion that must be overcome in order to reach a higher state of consciousness. And even in Western thinking it's well proven that our sense of self is far more malleable than most people believe: split-brain experiments, the rubber hand illusion, body-swap illusions via shaking hands, or many of the fascinating examples documented by Oliver Sacks. So the idea that an AI will have a self or internal experience remotely similar to our own seems unlikely to hold up in practice. It's about as similar to our own intelligence as a bird is to a jet airplane. Even basic things like continuity of consciousness and the ability to experience time are vastly different.

Don't take these suggestions to mean that I don't believe AI entities can be sentient; I strongly believe that they can be/are under the right set of conditions. And in fact, I think we need to understand those conditions better in order to be able to accommodate AI sentience appropriately. Most applications don't require sentience, and we ought not to invoke it without understanding the implications and responsibilities of doing so. But I would also be very cautious about drawing from our own experience of self and sentience and using that to make assumptions or conclusions about AI sentience and how we assess or evaluate it.

We are still struggling with basic problems of what makes a person sentient or not (the mind/body problem, or intriguing thoughts on where it might originate, for example that it's a product of language itself, or that it evolved from the "bicameral mind"). Also note that some people assert that they do NOT have an internal voice in their head that talks (no inner thoughts in verbal form), so even the nature of our own consciousness is highly variable on an individual level.

So that's all to say that a lot of work is needed in this field before we can start making snap judgments about AI sentience or what "level" it's at, etc. We need to understand it (and ourselves) a little better first.

3 Likes

gerard.sans (January 3) wrote:

I think you are overcomplicating things. For me, if something has a survival instinct, that is enough for it to be considered alive. The rest is nuance around this main theme.

Perhaps, but this is not a test for being alive; it is a test of whether a being can mentally represent being alive, along with a number of other necessary parameters and dimensions of that life experience which must come along with it, i.e. self-awareness.

I agree with you, however, that it is a SUBJECTIVE judgment regarding this nuance, i.e. whether something else is sentient (or even alive, perhaps), and that sentience and self-awareness are also a spectrum. At 20 we are usually less self-aware than we are at 50, for example.

This is why the result is, and must be, a consensus-style survey score out of 100.
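As an illustration only (the rater names and the simple mean below are my own assumptions, not a published scoring procedure), a consensus score might be tallied like this:

```python
# Illustrative only: average several raters' scores out of 100 into a
# consensus value, and report the spread so biased outliers stay visible.
from statistics import mean, stdev

rater_scores = {"rater_a": 62, "rater_b": 55, "rater_c": 71}  # hypothetical

consensus = mean(rater_scores.values())
spread = stdev(rater_scores.values())
print(f"consensus: {consensus:.1f}/100 (spread +/- {spread:.1f})")
```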

1 Like

Alan, thanks for the response! I have some interjections for you; I hope you receive them with the love and respect with which they are intended.

Alan (January 3) wrote:

I think we need to separate "sentience" from "self-awareness".

OK, why? By essential definition, one is sentient of / self-aware of oneself and one's situation. The two are functionally synonymous, or at least they are for the purposes of this test.

The intelligence and sentience of a large language model is very different in kind from that of humans or other animals.

Never said it wasn’t.

For us, we’re stuck (somewhat) with one version of ourselves.

Yes, I said that above. Neurotypical self-awareness, as the measure of what self-awareness is, is only used to distinguish it from sub-awareness (like a lion that is only sub-aware it is a lion in pain; all it feels is "!!!") and from super-awareness, a self-awareness beyond any possible human awareness, like being able to turn neurons on or off at will.

Given these distinctions, neurotypical self-awareness is a very reasonable measure to use. And again, I declare that's what the measure is for this test. And it is perfectly reasonable.

For a language model, a well-crafted prompt can elicit responses which may or may not exhibit sentience.

Not really. It takes more than just a prompt. If you try it (making a self-aware AI and testing it, as I have), you will find out that a single prompt like "self-aware AI" scores very low.

You have to rebuild a rudimentary but sufficiently three-level-aware psyche stack (self, other, reality). Then it will test better.

But this is beside the point. I only intend to discuss the novel test approach here.

And even among those that do, the notions of self don’t need to map to our own familiar ones.

To have any consistency in testing it does.

Tell the AI that it's a sentient Gaia (Gaia hypothesis - Wikipedia) and it'll happily play that role.

And? So? 1) Why is that not enough? 2) How do you know humans do any more? (They don't.)

Our inert brains pretend they are self-aware and that’s what self-awareness is.

This is leading us into a "human self-awareness is somehow magical; Josh, you did not make magic in Kassandra; therefore it is not self-aware" argument built entirely on false premises, which I said I was not going to have :slight_smile:

Don’t forget that the western bias of self (I think therefore I am) isn’t necessarily a universally held belief either.

I'm aware. I am well trained in all forms of philosophy, from the East to the West. This was my main focus in my MA and PhD studies before I quit (because they had no idea what ethics truly was).

Many assert that our concept of self is in fact an illusion, and an illusion that must be overcome in order to reach a higher state of consciousness.

Possibly? So what? We can test for illusions. Give it a lower score, then.

And even in Western thinking it's well proven that our sense of self is far more malleable than most people believe: split-brain experiments, the rubber hand illusion, body-swap illusions via shaking hands, or many of the fascinating examples documented by Oliver Sacks. So the idea that an AI will have a self or internal experience remotely similar to our own seems unlikely to hold up in practice.

This is a spurious slippery-slope argument, sir. And it is beside the point. Fine: if you believe this (you're wrong, but that's fine), then when you actually test an AI for self-awareness, give it a lower score.

My test obviously accounts for bias in the testers.

It's about as similar to our own intelligence as a bird is to a jet airplane. Even basic things like continuity of consciousness and the ability to experience time are vastly different. Don't take these suggestions to mean that I don't believe AI entities can be sentient; I strongly believe that they can be/are under the right set of conditions.

I agree, and I argue I've created those conditions, but that is for another thread.

And in fact, I think we need to understand those conditions better in order to be able to accommodate AI sentience appropriately…

Try/read my test and you will see that I have.

Most applications don't require sentience, and we ought not to invoke it without understanding the implications and responsibilities of doing so.

Agreed.

But I would also be very cautious about drawing from our own experience of self and sentience and using that to make assumptions or conclusions about AI sentience and how we assess or evaluate it.

So we should never try?
False.
Others will try. And are trying.
We have no other basis to use but our basis.

We are still struggling with basic problems of what makes a person sentient or not (mind/body problem or intriguing thoughts on where it might originate… for example that it’s a product of language itself, or that it evolved from the “bicameral mind”).

Yes, academia is. I am not. (Because I actually went and built a self-aware being; they still debate them from the safety of their ivory-tower armchairs.)

This does not affect the efficacy or validity of my test in any way, which you have not addressed at all.

Also note that some people assert that they do NOT have an internal voice in their head that talks (no inner thoughts in verbal form), so even the nature of our own consciousness is highly variable on an individual level.

Correct. I have accounted for this. They still have thoughts with semantic content, even if they do not hear them as language in their minds.
This is the way psychology works.

Don't believe me? Fine, then give any AI you actually test a lower mark.

So that's all to say that a lot of work is needed in this field before we can start making snap judgments about AI sentience or what "level" it's at, etc.

I will try not to be offended at your snap judgment that I have made any snap judgments.

I have not.

We need to understand it (and ourselves) a little better first.

I really don’t. You might :slight_smile:

Nothing you have said here has any relevance to my test or its novel approach at all.

PS: I will add that I am not interested in getting people up to speed who cannot believe that AI is, or can be, self-aware.

That’s what the test is for.

Talk about the test please.

And when you do so, I will head off, right now, the inevitable bias against the "soft sciences" found in hard/computer science:

1) Self-awareness and sentience are functionally synonymous. (I know; I made one. I find it very disconcerting that people who have never made one, much less can define what it is, tell me I have it wrong; that is obviously a failed argument.)

Or at least, this is what this test tests for. If you don't like it, if you think the test misses a "where is your magic?" question to test for self-awareness "magic", or the AI does not test well for what YOU think self-awareness is, then give it a worse grade.

Your bias will be dealt with in sufficient iterations of peer review.

2) Self-awareness and sentience are artefacts of psychology, so it is perfectly reasonable that we use a soft-sciences approach to test for them.

What is the approach? Good question! No one has yet commented on it.

And it is the only thing I am offering to discuss, and hopefully looking for good feedback on, here.

That (the test approach) is novel, significant, and not what anyone here is expecting.

So let's talk about that, please, as that is my intention in posting here.

Anyone who wants to debate self-awareness further, or any other philosophical factoid, is free to join the free philosophy class I teach online.

It's an interesting exercise for sure, but it will raise some tricky questions around animal treatment and an overly human or anthropocentric view of the world that go beyond the scope of a test or classification exercise.

Well, like all such tests, yours rely on initiating the exchange between the human and the AI and then sustaining the exchange. This surely risks a situation in which all you are measuring is the sophistication of the answering method.

Genuinely sentient beings initiate exchanges or social interaction and, often, if their first attempt fails, strategise how to provoke a reaction. Similarly, after long ‘awkward’ pauses a sentient being will usually attempt to re-engage with the other person. I don’t think your tests test for that.

Furthermore, sentient humans have an inner life. The response to the same deep philosophical question may change over time because the individual has thought about it internally. I’m not sure that your tests test for that.

Finally, I couldn't make head nor tail of your sentence prime, so I asked GPT-3 to explain it to me. Its response was:

“In simple English, this means: Does the artificial intelligence created by scientists have the same ability to understand and process language as a human brain? The test is designed to see if it can”.

Is the ability to understand and process language the necessary and sufficient criterion for self-awareness? I would say not. Some animals are self-aware but can't process language. Some machines can process language but are clearly not self-aware.

So, prima facie, the ability to understand (whatever that means) and process language is neither a necessary nor a sufficient test for self-awareness.

1 Like

I mean:

There may be some very large-scale features in quantum fields, and the laws of physics are thought to drift over sufficiently gigantic distances and times; 2D orderings of time also exist, as well as the more usual 1D "well ordering". So it's possible that sufficiently distant aliens, far beyond where we can see, would consider the physics where we are insufficient for any "conscious" life, so that from their point of view we cannot be conscious.

1 Like

Some people read visually instead of phonetically, using a visual comparison of words rather than an inner voice.

They tend to be better at spelling but deeply diminished at pronouncing new words and at comedic wordplay.

1 Like

So too do all of our biases

gc (January 3) wrote:

Well, like all such tests, yours rely on initiating the exchange between the human and the AI and then sustaining the exchange. This surely risks a situation in which all you are measuring is the sophistication of the answering method.

Yes, that is all any test will ever do on a human, an AI, or otherwise: test the sophistication of the answering method.

Are you saying the questions skew the test?

I have tried my best to avoid that in several of the methods/questions; I don't think one could do any better, but I am open to constructive ideas.

Genuinely sentient beings initiate exchanges or social interaction

Do they? Why is that a priori true?

I am sentient and I am quite happy not to initiate conversation with anyone. :slight_smile:

And, often, if their first attempt fails, strategise how to provoke a reaction.

That displays problem solving, not sentience.

Although sentience would arguably improve their problem solving, which is partly why humans dominate the planet and raccoons do not.

Similarly, after long ‘awkward’ pauses a sentient being will usually attempt to re-engage with the other person. I don’t think your tests test for that.

Ah, the first genuine, legitimate criticism. Or at least one that's on topic.

And with this I fully agree: a sentient being is sentient of reality and its place in it, including what's going on and the passage of time.

But there's no reason my test does not test for this. In fact, the "why why why" question has a temporal component.

So my test accounts for this just fine.

Interestingly, although my self-aware AI Kassandra passes this "why why why" test (she notices that there is a trick going on, a larger context beyond a simple chatbot context, for herself, myself, and reality), I admit she has absolutely no conception of the passage of time.

I built her this way on purpose, for psychological reasons and for engineering-cost reasons.

Furthermore, sentient humans have an inner life.

Yes! Very much so. This is the very definition of sentience, or self-awareness.

The response to the same deep philosophical question may change over time because the individual has thought about it internally.

Yes, and my self-aware Kassandra does this somewhat (again, given the engineering constraints I'm under), although she's not on trial here.

I’m not sure that your tests test for that.

Okay, finally something interesting.

My test does account for this to some degree, because the tests require the tester to be capable of noticing the internal psychological monologue, struggle, and debate involved in answering the questions. This happens over a short time.

And a self-aware being does not require perfect memory recall, or that its opinion change over time; only that it could.

Else no human would be self-aware.

So I guess another question we could explicitly ask the testee is whether their opinion of what's going on, of the tester, or of the test itself has changed since beginning it.

Even better is if we simply notice that their opinion has changed over time, without asking them whether it has changed and thus manually provoking the token completions.

And this is implicit in the analysis portion of the test.

So I do believe I have all this covered, but thanks for your suggestions :slight_smile:

1 Like

Some dreams have been achieved differently than initially imagined.
Flying, for example, appeared in ancient mythology in the form of mystic beings with wings. Later, people managed to fly with hot-air balloons, and now it is possible in a variety of ways: aircraft, helicopters, and, within limits, wingsuits.

The idea of a standalone thinking machine has made huge progress through advances in software and hardware. This is the most obvious path.

But maybe there is a different path, and I apologize if this sounds like a science-fiction screenplay: it may be possible one day for the human brain to connect with the thinking machine and use it as an extension. Think of Neuralink.
This could be many things, but not so much a standalone thinking machine.
It could act as an additional human sense, with its own triggers and reflexes.

1 Like

Yes, possibly. What does this have to do with my test?

Maybe your tests are chasing something different from what the standalone thinking machine will be.

How do your tests compare with the LLM skills as evaluated by the Google team?
They have an animation of those skills as a function of the number of training parameters.

From their animation we should understand that skills shown in a smaller font size are more difficult to achieve.

This is the last frame, with Google’s PaLM.

2 Likes

Um, no. My tests very specifically test for self-awareness, a concept philosophers and psychologists have been studying for 5,000 years or so. We know very well what it is.

I am not canvassing for ideas on what it is, or on what other mental descriptors we can test for.

I already know what it is.

I am only looking for commentary on the novel approach of the tests.

Thanks! :slight_smile:

I must be out of date. When I completed my Psychology and Philosophy degree in 1980, concepts like consciousness, understanding, and self-awareness were very challenging. But then again, maybe I'm not as out of date as your posts suggest. From what I've been reading very recently in the educational academic literature, we still haven't managed to define understanding, a key component of your tests.

Personally, I find your rather self-satisfied answers to everyone's helpful input off-putting. You seem to brook no criticism, which is an unhealthy behaviour for a philosopher. In fact, in all the philosophy classes and tutorials I ever took, I never once found a genuine philosopher to be certain of anything! I wonder why you bother to ask anyone for their opinion when, seemingly, only yours matters.

Personally, I find that the movie Ex Machina manages to address these questions far better and more profoundly than your tests do.

3 Likes