I saw @boris retweet this on Twitter and I thought it was hilarious, salient, and a great conversation starter. I wish I had seen this before writing Benevolent by Design but I’ll settle for talking about it here. Specifically, this is talking about “outer alignment” - the question of whether or not AI aligns with humanity’s interest. Personally, I think that definition is already bad - we need AI that aligns with the entire planet’s interest (not just humans).
Without further ado:
It sounds like scifi so it’s not possible
Mobile phones were once scifi, as was space travel. Both are now science fact. We take technology for granted today, but to people a century ago, computers are pure magic.
“Any sufficiently advanced technology is indistinguishable from magic”.
Fiction also serves as people’s primary model for understanding this stuff. Imagination is required for both fiction and predicting the future. After all, Skynet is most people’s model of AGI (or The Matrix)
Smarter AI will also be more moral
There’s only some correlation between intelligence and morality in humans, so I’m not sure why we would make the assumption that smart AI would intrinsically be moral. Anyways, human morality is rooted in emotion, mirror neurons, and empathy. None of that has anything to do with intelligence (unless you add emotional intelligence to the definition). So no, I would assert that morality is an entirely separate framework from intelligence.
AI wouldn’t want to kill us
See: paperclip maximizer.
TLDR: If the AI has the wrong objective function, it might just see you as a collection of usable resources.
AI killing us is actually a good thing
This is just a trauma response from people who are nihilistic because they have untreated depression and not worth engaging with on an intellectual level.
We shouldn’t obstruct the evolution of intelligence
Intelligence does not evolve. Evolution created intelligence as a byproduct of another objective function: to propagate DNA. In short, intelligence is not itself an objective function and so we shouldn’t assume that an AI will “evolve” at all unless we design it to do so. And even then, it will only “evolve” towards whatever objective function we give it. We should NOT give it the objective function of “be as intelligent as possible.”
The goal is not to make a superintelligence, the goal is to serve other purposes.
Smart AI would never pursue dumb goals
This is only possible if we give it the ability to evaluate its goals and change them. But that leads to the Control Problem - if an AI can change its goals, how do you know those goals won’t change and become malevolent?
AGI is too far away to worry about right now
I would argue that GPT3 is already limited AGI. It’s certainly smarter than your average Redditor. What happens in a few years when GPT4 and other successor technologies are smarter than 99% of all humans? It won’t take long.
Just give the AI sympathy for humans
We sympathize with sick dogs and yet we still euthanize them. Sympathy is not necessarily the best thing. An AI might evaluate human existence and conclude that the most sympathetic thing to do is to put us out of our misery.
AI will never be smarter than humans
See previous statement about your average Redditor. GPT3 is already smarter than 50% of humans (at least). Only a matter of time for that number to creep up.
We’ll just solve alignment when we get there
Well… we’re there now! Time to get to work
Maybe AGI will keep us around like pets
Possibly, but is that an optimal outcome? Why do humans keep animals? Generally, we take care of animals for a few reasons: affection and curiosity. I do agree that we should create a curious AGI. Curiosity cannot be satisfied if something is eradicated. But curiosity alone is not enough, since unrestrained curiosity can lead to torturous experimentation.
We should NOT give AGI a sense of affection. We don’t need a machine to be clouded by emotion like us.
Just use Asimov’s Three Laws
Asimov may be a towering figure of science fiction but he never even conceived of superintelligence or global AGI. He was only thinking of robotics. Even then, the Three Laws are terrible for robots. What if you tell a robot to tear down a building or set a forest on fire? There’s nothing in the Three Laws to prevent it from obeying.
Just keep the AI in a box
For responsible researchers, maybe this will work. But there are hostile nations out there as well as a free marketplace of technology. Someone is going to unleash the machine, this is just inevitable.
Just turn it off if it turns against us
This will work up to a certain point, but eventually, we should assume that an AGI will become powerful enough to prevent this from happening if it wants to.
Just don’t give AI access to the real world
Same argument as keep AI in the box.
Just merge with the AI
Lots of problems with this:
- No guarantee this is possible or beneficial
- Not everyone will want this
- What objective function would this even satisfy?
Just raise the AI like you would a child
AGI does not learn like a child. Still, to address this in good faith, the idea is that children first learn morality through cause and effect (pre-conventional morality). “If I do bad thing, I get punished.” Later, children learn “conventional morality” which is morality through social expectations. Lastly, people develop “post-conventional” morality, which is where they hold themselves to higher ideals.
All of this presumes that an AGI can learn through punishment, social pressure, and transcendent ideals. None of that will be possible unless we design it in, but I don’t think we should. Fear of punishment stems from pain and suffering, which I think we should never give AGI a sense of suffering. I don’t think it would be ethical to do so.
We can’t solve alignment without understanding consciousness
This is a red herring. We have courts of law, philosophy, and ethics for us humans even though we don’t really understand our own consciousness. Therefore “comprehension of consciousness” is not a valid precondition for alignment.
The real danger is from modern AI, not superintelligence
Well, if we’re going to take the Scooby Doo approach and unmask the real villain then a better way of saying this is “The only danger of AI today is bad humans”. The same will also be true of AGI: malicious humans using it for evil. The scary part is unintended consequences of AGI.
All of the above are dangerous.
Just legally mandate the AI must be aligned
The AGI might not care about human laws. Next.
AGI can’t do X yet, therefore AGI is far away
Within STEM, there are two kinds of advancements: saltatory and gradualistic. Some technology is very gradualistic, like batteries and processors. They get better very slowly and predictably. Deep learning, with each breakthrough of loss functions and neural network architectures is saltatory, meaning that each advance is a leap forward. GPT3 is so advanced that most people don’t comprehend what it is or what it’s capable of. Indeed, as I mentioned already, GPT3 surpasses many human’s mental capabilities so AGI is not that far away.
Just penalize the AGI for killing people
Okay. How? Spank it? Reinforcement learning? What about the hundreds millions, or billions of people who might die before AGI figures it out?
Train multiple AGI and have them fight it out
Unfortunately, I predict this will be a necessity. Imagine that a hostile nation (likely China) builds a military AGI and uses it to attack Europe and America. How do you defeat such an opponent? Sometimes you must fight fire with fire. We in the research community cannot stop or change what militaries do. We can’t change national policy or international trends.
It might be hard but we’ll rise to the occasion as always
Personally I don’t think it’s that hard. The hardest thing to overcome is human ignorance and stubbornness. Once you get past that and get to work, this is not that hard of a problem. In hindsight, I think people will think “What were we so afraid of??” People will look at movies like The Terminator and think that Skynet was hilariously shortsighted and primitive.
The key thing is to pivot away from antiquated ideas of fear and to move towards a different fundamental disposition.