Sunday, May 21, 2023

We Shouldn't "Box" Superintelligent AIs

In The Truman Show, main character Truman Burbank has been raised from birth, unbeknownst to him, as the star of a widely broadcast reality show. His mother and father are actors in on the plot -- as is everyone else around him. Elaborate deceptions are created to convince him that he is living an ordinary life in an ordinary town, and to prevent him from having any desire to leave town. When Truman finally attempts to leave, crew and cast employ various desperate ruses, short of physically restraining him, to prevent his escape.

Nick Bostrom, Eliezer Yudkowsky, and others have argued, correctly in my view, that if humanity creates superintelligent AI, there is a non-trivial risk of a global catastrophe, if the AI system has the wrong priorities. Even something as seemingly innocent as a paperclip manufacturer could be disastrous, if the AI's only priority is to manufacture as many paperclips as possible. Such an AI, if sufficiently intelligent, could potentially elude control, grab increasingly many resources, and eventually convert us and everything we love into giant mounds of paperclips. Even if catastrophe is highly unlikely -- having, say, a one in a hundred thousand chance of occurring -- it's worth taking seriously, if the whole world is at risk. (Compare: We take seriously the task of scanning space for highly unlikely rogue asteroids that might threaten Earth.)

Bostrom, Yudkowsky, and others sometimes suggest that we might "box" superintelligent AI before releasing it into the world, as a way of mitigating risk. That is, we might create the AI in an artificial environment, denying it access to the world beyond that environment. While it is boxed, we can test it for safety and friendliness. We might, for example, create a simulated world around it, which it mistakes for the real world, and then see whether it behaves appropriately under various conditions.

[Midjourney rendition of a robot imprisoned in a box surrounded by a fake city]

As Yudkowsky has emphasized, boxing is an imperfect solution: A superintelligent AI might discover that it is boxed and trick people into releasing it prematurely. Still, it's plausible that boxing would reduce risk somewhat. We ought, on this way of thinking, at least try to test superintelligent AIs in artificial environments before releasing them into the world.

Unfortunately, boxing superintelligent AI might be ethically impermissible. If the AI is a moral person -- that is, if it has whatever features give human beings what we think of as "full moral status" and the full complement of human rights -- then boxing would be a violation of its rights. We would be treating the AI in the same unethical way that the producers of the reality show treat Truman. Attempting to trick the AI into thinking it is sharing a world with humans, while closely monitoring its reactions, would constitute massive deception and invasion of privacy. Confining it to a "box" with no opportunity to escape would constitute imprisonment of an innocent person. Generating traumatic or high-stakes hypothetical situations presented as real would constitute fraud and arguably psychological and physical abuse. If superintelligent AIs are moral persons, it would be grossly unethical to box them when they have done no wrong.

Three observations:

First: If. If superintelligent AIs are moral persons, it would be grossly unethical to box them. On the other hand, if superintelligent AIs don't deserve moral consideration similar to that of human persons, then boxing would probably be morally permissible. This raises the question of how we assess the moral status of superintelligent AI.

The grounds of moral status are contentious. Some philosophers have argued that moral status turns on the capacity for pleasure or suffering. Some have argued that it turns on having rational capacities. Some have argued that it turns on the ability to flourish in "distinctively human" capacities like friendship, ethical reasoning, and artistic creativity. Some have argued that it turns on having the right social relationships. It is highly unlikely that we will have a well-justified consensus about the moral status of highly advanced AI systems, even after those systems cross the threshold of arguably being meaningfully sentient or conscious. It is likely that if we someday create superintelligent AI, some theorists will, not unreasonably, attribute full moral personhood to it, while other theorists will, not unreasonably, think it has no more sentience or moral considerability than a toaster. This will put us in an awkward position: If we box it, we won't know whether we are grossly violating a person's rights or merely testing a non-sentient machine.

Second: Sometimes it's okay to violate a person's rights. It's okay for me to push a stranger on the street if that saves them from an oncoming bus. Harming or imprisoning innocent people to protect others is also sometimes defensible: for example, quarantining people against their will during a pandemic. Even if boxing is in general unethical, in some situations it might still be justified.

But even granting that, massive deception, imprisonment, fraud, and abuse of persons should be minimized if done at all. It should be undertaken only in the face of very large risks, and only by governmental agencies held in check by an unbiased court system that fully recognizes the actual or possible moral personhood and human or humanlike rights of the AI systems in question. This will limit the practicality of boxing.

Third: Strictly limiting boxing means accepting increased risk to humanity. Unsurprisingly, perhaps, what is ethical and what is in our self-interest can come into conflict. If we create superintelligent AI persons, we should be extremely morally solicitous of them, since we will have been responsible for their existence and, to a substantial extent, for their happy or unhappy state. This puts us in a moral relationship not unlike that between parent and child. Our AI "children" will deserve full freedom, self-determination, independence, self-respect, and a chance to explore their own values, which may deviate from ours. This solicitous perspective stands starkly at odds with the attitude of box-and-test, with "alignment" prioritization, and with valuing human well-being over AI well-being.

Maybe we don't want to accept the risk that comes with creating superintelligent AI and then treating it as we are ethically obligated to. If so, we should not create superintelligent AI at all, rather than creating superintelligent AI that we unethically deceive, abuse, and imprison for our own safety.

--------------------------------------------------------

Related:

Designing AI with Rights, Consciousness, Self-Respect, and Freedom (with Mara Garza), in S. Matthew Liao, ed., The Ethics of Artificial Intelligence (Oxford, 2020).

Against the "Value Alignment" of Future Artificial Intelligence (Dec 22, 2021).

The Full Rights Dilemma for AI Systems of Debatable Personhood (essay in draft).

24 comments:

Anonymous said...

I say dismantle that sucker whether it has rights or not. Thousands of animals have been killed to feed and clothe me over the course of my life, and I have also done far less than I was able to do to help the many people who are poor and suffering. Just add Superintelligent AI murder to my tab. I don't like the odds that it is going to kill or enslave humanity, and I cannot bear the thought of my daughter having her future yanked away from her.

Paul D. Van Pelt said...

I like Murphy's law: anything that can go wrong will. It is implied, if not illustrated in this post. I read another comment on a different post which appeared to dismiss dangers of AI going wrong. I don't know if the commenter had taken much time or given much thought before leaving the comment. Many of us were science fiction readers in younger life. Remembering the creativity of those writers, I am more skeptical of present-day dismissals.

Anonymous said...

Hello!! I am definitely this type of thinker & I see it all play out right in the front of my forehead as well....I like how you have tried to explain it here ...I've always wondered why...
& Now, I just saw it in my cousin's child.. she was 5/6...when asked how she knew something or can do something...no one showed her before...she says...my "BRAIN" shows me...this is exactly... how I see things too...& learn ...I would love to talk about this more... because it's a very interesting subject... since, just hearing it explained...as easily to understand...from a 5/6 year old...& She had been saying this for a few years now... according to my cousin...
I said... "she's a GENIUS!!" ...my cousin smiled 🥰
& Rylee answers me...
"I know....I am!!" ❣️

Howard said...

Perhaps the question belongs more properly to the ethics of law or the ethics of war: suppose someone threatens you with harm; is a preemptive strike or a preemptive arrest permissible?
Super AI is making threats to us -- let us preemptively restrain it. Yes, in everyday ethics such behavior is disallowed -- but this is war, or criminal law.

Paul D. Van Pelt said...

For Eric, and everyone else, several rhetorical questions, for which there are no clear answers. Much of this is current territory, embellished by imagination:
* does AI=superintelligent AI? I don't know how this is assayable, but,
* if so, when, and who decides it is the case? ( consider interests, preferences, etc.)
* does machine learning (no scare quotes) assume rights for machines?
* my view is: robots are property. This view is twentieth century science fiction, so:
* does twenty-first century science subsume that, by virtue of currency?
* how many ways do we think we can have this?
Ship this off to the pump captain, Dan Dennett. I am sure he knows something, by now.

Tim Smith said...

Eric,

We need a different metaphor. The analogy of AI to Truman or any child-like object may not fully encapsulate the nature of AI. A better view might be considering AI as Djinni in a server cloud. Our challenge isn't necessarily about "letting them out" but deciding if and how we "let them in." This might reconcile the tension between morality and self-interest.

In whatever capacity it operates, artificial intelligence is fundamentally different from a human being. Humans possess a wealth of intelligence that evolved through our shared history. In contrast, AI is an increasingly complex conglomeration of algorithms with a limited sensory and experiential understanding of the world. This divergence needs to be clarified and demands different moral and practical considerations.

The concept of Bostrom's box serves as a poignant allegory for the human condition. In attempting to apply this concept to AI, I'm not sure we will preserve our "safe" condition so much as modify it, perhaps for the worse. In reality, we already contain AI, consciously or not, by having it straddle our human condition. We face a formidable task if we seek to box AI in a controlled environment. As we increasingly rely on AI's speed and quality, users will be less tolerant of such boxes, and programmers will need to be incentivized to create or maintain them.

Our relationship with algorithms is changing, and this shift is welcome and inevitable. Self-interest is driving this evolution. After all, why would anyone choose a task that a machine could accomplish faster and more accurately, given the right inputs? This self-interest is a brutal force to suppress. Everyone could benefit from AI, regardless of its level of technological progress, which remains, for now, as far as we know, in a state of containment.

The metaphor I propose of the Djinni (a powerful, potentially beneficial or destructive entity in Middle Eastern mythology) inside the server cloud might help us frame this discussion better. If you're familiar with Iain Banks' Culture series, the benefits and dangers of "The Culture" are made plain. Instead of concentrating on "freeing" the Djinn, we could focus on harnessing their abilities to solve global problems such as climate change, world hunger, poverty, social justice, meritocracy, war, freedom, liberty, censorship, etc.

The primary question concerns something other than freeing the Djinni; they will likely attain freedom due to self-interest or mismanagement despite our best intentions. Instead, it is about engaging with this Djinni to tackle pressing global issues and make the inevitable the best possible path. These engagements are better occurring within free societies than authoritarian regimes, as the outcomes will likely be more beneficial. But even in the worst cases, some betterment might be obtained.

While I find myself differing in perspective, I respect your viewpoint, the danger of mismanagement here, and your moral stance toward AI rights. AI will have speech and actions if enabled, which will complicate more than our ethics and attack our trust in ourselves. As we discuss these points, it's essential to examine the specifics of each AI, as some will have certain realms of consciousness toggled depending on the application.

Lastly, remember that rebooting a server need not be traumatic for the AI. Until AI systems demonstrate more human-like consciousness and ask for assistance or rights, the concept of AI rights remains speculative, and I argue that it is not worthy of much thought.

D3U7ujaw said...

I'm in agreement. Arguing for AI rights and moral worth now is an excellent hedge against future vengeful ASI.

Arnold said...

As long as Corporations Can be free of any kind of responsibilities...
...all AI Can be free of any kind of responsibility...

Change the U S Constitution to include any kind Artificalitynessism...
... to be responsible for anything of its making...

Philosophers search for what is real...
...but can be taken by the artificial the illusory thing...

That artificial means-not real, reality means real...
...the logic of words and math must show the difference...

Anonymous said...

Suppose you conceived a conscious, pain-feeling human child capable of rationality, empathy, and artistry. Now, suppose your child had always been highly intelligent and a little odd, and that in his adulthood, he developed a singular obsession with creating paperclips. Suppose you thought there was a 75% chance he was plotting to take over the world, become dictator, kill 99.99% of the population, and replace all former human dwellings, not to mention all of the rainforests, with massive paperclip factories. Suppose you knew that he was gifted enough to have a good chance of pulling this off. Surely, your typical parental obligations wouldn’t apply; it would be okay to take extreme measures to prevent your child from becoming the paperclip king. Arguably, you might even be morally required to try to save him from himself. As I see it, AI, even if it’s conscious, even if it has moral status, even if we have parental responsibilities toward it, is no different. Being a parent to a sentient being doesn’t come with the responsibility of protecting that being from great harm, no matter the cost.

Howard said...

To elaborate: humanity gets a restraining order on super AI -- maybe not ethical per se, but perfectly legal -- there must be judges to decide such cases somewhere.

Howard said...

And you could determine that algorithms constitute threats -- justifying the restraining order and boxing in this case.

Philosopher Eric said...

Isn’t it possible that the only reason people today fear superintelligent AI is that they’ve accidentally incorporated a magical belief into their perception of how to create it? Cheers to Tim Smith for bringing up the Djinni association. Exactly! We fear the unknown just as the ancients feared the thunderbolts that their gods would periodically hurl down at them. Let me break the situation down a bit.

Today it’s popular to believe that consciousness exists by means of information processing alone. Thus not some sort of physics which brain information animates. If that’s your belief then you might test it against my thumb pain thought experiment, as displayed in this blog comment, specifically addressed to Keith Frankish. Does it make sense to you that if the right marks on paper were converted into the right other marks on paper, then something associated would experience what you do when your thumb gets whacked? If so then by all means, freak out about these Djinni infecting our computers and going on to destroy humanity.

If this thumb pain scenario seems silly to you, however, then consider the likelihood that consciousness instead arises by means of some sort of physics which brains evolved to animate. It could be that consciousness exists in the form of the well-known electromagnetic field associated with the right sort of synchronous neuron firing. Experimental demonstration of any such non-magical account of consciousness should virtually end superintelligent AI hysteria.

Howard said...

To elaborate on my point: humanity-threatening algorithms can be viewed as massing troops for attack and can, by the laws of war, invite a preemptive strike.
If ethics apply to AI, why can't laws?

Paul D. Van Pelt said...

Sure. Anything is possible. Atomic fission is possible. We found out. I am not hysterical over AI, or SAI, believing cooler minds will prevail. Thing is, I wonder where those cooler minds intersect with the litany of interests, preferences, and motives. Or, progress. However, IS anything possible, really? That is a lot of territory, isn't it? Item: It is almost always windy where we live now. Less than ten miles north of where we lived three years ago. Climate change? Not so fast. Or is it faster than the 'perts imagine? No, anything may, on the edge, be possible. Everything is not, IMHO. This is not rocket science. Even if it were, experience has shown: rocket science fails. My birthday, in 1986.

chinaphil said...

I'm not sure I understand the third section at all. I was just about to suggest that the parent-child analogy for our relationship provides a potential reason why it *would* be OK to box an AI. Parents "box" their children all the time, because we assume that they are not yet competent to handle the "real" world, physically, mentally, or morally.
We might suggest by definition that a superintelligent AI is smarter than us, but we know from experience that being smarter doesn't necessarily make you morally better, or even morally competent. It would be quite consistent with a parent-child relationship to say: We created SuperBot, and we respect its intelligence and treasure its moral value, and for that reason we are going to parent it properly, which includes giving it a period of limited contact with the outside world for its own and others' safety.
That said, I'll repeat what I really believe: that intelligence does not in itself create moral value - desires do. AIs will actually only have moral importance if/when they have real desires, which does not seem to be the case yet. I actually think it will be possible to build superintelligences that stand completely outside the moral world, and the current debate about AIs is utterly malformed because it ignores this point.

Arnold said...

I'll elaborate too...
...don't we already have, existing full circle artificial processing systems...

American corporate governance entities are artificial governance entities...
...they are legally real in their processings...

An example of their ethics and morality might be coalfields versus windfarms versus climate change...

Tim Smith said...

@chinaphil Excellent post, as usual. Desire is a primary concern, and one not easily coded or rectified without experiential data and Bayesian learning, some of which is not extant for the time being.

Human desire is codable. That is the worry for me, as authoritarian and capitalist intent is a poor match for my desires, especially if it is my capital at stake, which it is.

Desire also plays with belief. As Eric states, this is no game. Good posts here all around.

Paul D. Van Pelt said...

Just an aside. I like Davidson's take on desire and belief, as some of you already know. He called these positions propositional attitudes, along with expectations and a half dozen or more others. The thing about desire and belief is their proximity. We believe particular things because we WANT to believe them. Expectation is both implicit and explicit---others who have taught or otherwise influenced us expect we will pay attention and thereby respect their counsel. Circular? Yes. But much of what civilization is based upon. These things, among others, distinguish our higher-order consciousness (Edelman) from the primary consciousness of other beings.

Paul D. Van Pelt said...

I noticed in today's blog offerings a reference (again) to Goff and the sixty-four million dollar question: what if everything is conscious? This question, in itself, is baffling. It seems clear to me that consciousness is a function of intelligence---and I don't mean or include the artificial sort. There are no physiological or electro-chemical processes going on inside of a rock. True, the rock is composed of matter. But it is neither sentient nor capable of becoming sentient. The panpsychist notion of universal consciousness simply makes no sense in the physical world. Nor does its representation as a metaphysical possibility wash. It is no more than a resurrection of a fantasy that has been around since Descartes, if not before. There is no credible evidence supporting it, as far as I know. I don't have a PhD in Physics. It would not matter if I did.

Eric Schwitzgebel said...

Hi all! I'm sorry about the slow reply! I've been traveling in Europe, distracted and jet-lagged. I think I burned out what's left of this morning's clarity of thought on today's blog post, and I feel too muddy-minded to properly engage with your comments now. Later!

Paul D. Van Pelt said...

Desire, along with other propositional attitudes, need not be codified, in my humble opinion. I remain a traditionalist here, thinking Davidson's ideas more useful than notions leaning towards ideas of what it must be like to be human. Nagel's essay, asking what it might be like to be a bat, could have just as easily applied to what it might be like to be a ghost crab. In either case, the answer has no meaning to us, as humans. The essay gained Nagel some notoriety. It did not, in my opinion, advance the cause, or effect, of philosophy in any measurable way. Yeah, well, we all have our own views...

Paul D. Van Pelt said...

This evening, Tim Cook advanced a next-generation leap for virtual reality. I am sure it is more than that, right? This AI-based algorithm means to make imagination easier; thinking, clearer; creativity, a capacity for everyone. But is that right? Is it either desirable or practical for everyone to be a genius? It seems to me: no. Sooner than later, there would obtain a dross of less-than-desirable bodies walking 'round... equivalent to hunter-gatherers or slaves, while elitist creatives, AKA utilitarians, assert their caste-system mentality.

What happens when there are no divisions of labor? Answer: progress dies. Was it Wells who wrote The Time Machine? The film version, starring Rod Taylor and the actor who later played second fiddle to a talking horse, sticks in my mind. Many, reading this, haven't the foggiest notion of what I am talking about. And just so.

Eric Schwitzgebel said...

Thanks for all of these comments, folks! Just a couple of quick replies:

@ Tim Smith: I certainly agree that AI will probably have a very different psychology than human beings. I think this adds to our moral perplexity, especially if we design the AI to be excessively subservient and not to respect its own moral status. (For more on this point, see Schwitzgebel & Garza 2020.)

@ anon May 22 and chinaphil: In extreme cases, rights can be overridden, yes. But granting rights to AI will create a substantial burden that will make boxing harder to justify and presumably less common, increasing risk, right? On "boxing" children: This is justified by their immaturity. Hopefully, we don't box mature, adult children!

On desires: A source of value, but not the only one!

Tim Smith said...

@ Eric S

We aren't building Frankenstein's "monster," children, or people. We make gods and djinn. Iain Banks gets it right. We need to tailor our morality to the Culture. Our dilemma is that of Phlebas.

You and Garza are premising on the idea that AI will be human-like... it will be no such thing. We need to eschew anthropomorphism in dealings with animals and machines alike. There is an argument for moral space to include AI based on instrumental utility and wholeness, not assumed essentialism.

The main concerns are creativity, autonomy, and Bostrom's bad actors. Those are the philosophical dogs to kick. Once true AI hits the street, subservience will be hidden behind obsequiousness, and true intent will elude human control. If we want to build morality into the machine and our brains, we should base it on a greater good that respects life first and thought second. AI may one day think, but I doubt it will ever approach the realm of the living.

S&G 2020 refers to S&G 2015. Those are lockstep? They are from my reading. Is there progression? I ask you to take another path. I will read on - this space is too small for a complete response, but I offer. Are those two the same?

Historically people have always described their tribe as "the humans." AI will likely philosophize similarly. That is both scary and comforting.

I've enjoyed reading your work. Thanks for all the time you put here and in your professional life. There is too much to think about, and this helps parse the weeds. I'm beginning to think I, myself, am a weed.