Tuesday, February 03, 2015

How Robots and Monsters Might Break Human Moral Systems

Human moral systems are designed, or evolve and grow, with human beings in mind. So maybe it shouldn't be too surprising if they would break apart into confusion and contradiction if radically different intelligences enter the scene.

This, I think, is the common element in Scott Bakker's and Peter Hankins's insightful responses to my January posts on robot or AI rights. (All the posts also contain interesting comment threads, e.g., by Sergio Graziosi.) Scott emphasizes that our sense of blameworthiness (and other intentional concepts) seems to depend on remaining ignorant of the physical operations that make our behavior inevitable; we, or AIs, might someday lose this ignorance. Peter emphasizes that moral blame requires moral agents to have a kind of personal identity over time which robots might not possess.

My own emphasis would be this: Our moral systems, whether deontological, consequentialist, virtue ethical, or relatively untheorized and intuitive, take as a background assumption that the moral community is composed of stably distinct individuals with roughly equal cognitive and emotional capacities (with special provisions for non-human animals, human infants, and people with severe mental disabilities). If this assumption is suspended, moral thinking goes haywire.

One problem case is Robert Nozick's utility monster, a being who experiences vastly more pleasure from eating cookies than we do. On pleasure-maximizing views of morality, it seems -- unintuitively -- that we should give all our cookies to the monster. If it someday becomes possible to produce robots capable of superhuman pleasure, some moral systems might recommend that we impoverish, or even torture, ourselves for their benefit. I suspect we will continue to find this unintuitive unless we radically revise our moral beliefs.

Systems of inviolable individual rights might offer an appealing answer to such cases. But they seem vulnerable to another set of problem cases: fission/fusion monsters. (Update Feb. 4: See also Briggs & Nolan forthcoming). Fission/fusion monsters can divide into separate individuals at will (or via some external trigger) and then merge back into a single individual later, with memories from all the previous lives. (David Brin's Kiln People is a science fiction example of this.) A monster might fission into a million individuals, claiming rights for each (one vote each, one cookie from the dole), then optionally reconvene into a single highly-benefited individual later. Again, I think, our theories and intuitions start to break. One presupposition behind principles of equal rights is that we can count up rights-deserving individuals who are stable over time. Challenges could also arise from semi-separate individuals: AI systems with overlapping parts.

If genuinely conscious human-grade artificial intelligence becomes possible, I don't see why a wide variety of strange "monsters" wouldn't also become possible; and I see no reason to suppose that our existing moral intuitions and moral theories could handle such cases without radical revision. All our moral theories are, I suggest, in this sense provincial.

I'm inclined to think -- with Sergio in his comments on Peter's post -- that we should view this as a challenge and occasion for perspective rather than as a catastrophe.

[HT Norman Nason]

37 comments:

Callan S. said...

Hi Eric,

"One problem case is Robert Nozick's utility monster, a being who experiences vastly more pleasure from eating cookies than we do. On pleasure-maximizing views of morality, it seems -- unintuitively -- that we should give all our cookies to the monster."

I don't understand how, if someone holds such a view, that implication would seem unintuitive to them.

In such a case either they mean something else by saying they have that view, or are saying they have the view but don't and are lying, or say they have the view but don't and are in denial about that.

"A monster might fission into a million individuals, claiming rights for each (one vote each, one cookie from the dole), then optionally reconvene into a single highly-benefited individual later. Again, I think, our theories and intuitions start to break."

Absolutely! I have to admit it's tempting to go into problem solving mode immediately (perhaps so as to make it appear there is no problem), rather than just pause and acknowledge the problem.

chinaphil said...

Yes, absolutely to all this. I went and read Nick Bostrom's book, and quickly became frustrated by two things:
1) The assumption that AI will have "desires" or "goals" in a way we can understand (mentioned this last post)
2) The assumption that the survival of the human race will remain paramount

Criticising (2) is perhaps a bit unfair - it's the premise of his book, after all. But it started to seem increasingly unsupportable. If our machine overlords become that much smarter than us, then why not just let them direct us? If they calculate that uploading to the iCloud is the best thing for us, then who are we to balk at our destiny?

Incidentally, I'm not sure it's the utility monster we should be worried about. Surely we can just set up a market system for whatever it gets utility from, and it can work 1000 times as hard for its cookies as we do? I'd be much more scared of a utility rock, one which requires a vast amount of cookies to gain even a tiny bit of utility.

But the utility monster is still a provincial thought. It assumes that the AI's moral frame of reference will be the same as, or commensurable with ours. Also worth thinking about whether radically different life forms would have radically different types of morals. What if for some AI, carbon is literally pleasure/utility? What if the machine is an even virtue mind, for which divisibility by two is a fundamental virtue?

Such an encounter might well confront us with just how empty our own morality is. It's no more than the agglomeration of thousands of viral patterns, after all.

Fred Eaker said...

Based on the treatment of animals in factory farms, we are utility monsters from their perspective.

Scott Bakker said...

Very cool. So what exactly do you mean by 'take as a background assumption,' Eric? If we understand this representationally, then it seems to suggest we need only tweak our implicit assumptions to extend moral cognition to less than stable entities.

And what do you make of the distinction between conscious/nonconscious moral cognition? Is it moral cognition itself that breaks down, or is it our moral *theorization,* which I would argue was broken to begin with (for similar reasons to why I think AI breaks moral cognition more generally)?

And if the problem lies with moral cognition, not our attempts to discursively regiment it, how, short of the posthuman (rewiring our moral systems) could this prove to be anything but disastrous?

Eric Schwitzgebel said...

Thanks for the interesting comments, folks!

Callan: I just meant it as a shorthand way to point out the following common philosophical phenomenon: You say (sincerely) that you are committed to X. But X (though you don't realize it) has implication Y, which you would, if you thought about it, be intuitively inclined to reject. (Intuition = judgment derived from any process other than explicit reasoning [Gopnik & Schwitzgebel 1998].)

Eric Schwitzgebel said...

chinaphil: I find Bostrom interesting on (1) but I agree a bit frustrating on (2). Interesting on (1) because -- bracketing (for this whole discussion so far) my skepticism about the metaphysics of consciousness and thus about whether machines could really have conscious experiences -- I think that intelligent, stable systems are very likely to have something like beliefs and goals (as I suggest in my post on Matrioshka Brains).

I do think it's interesting in this context to think about how much morality might vary between different cognitive systems. One way to explore this, I think, is through speculative fiction: Can you speculatively design a coherent (society-embedded?) entity with a very different moral structure? I'm working on a couple of stories that push on this a bit, and I'm open to reading suggestions.

Eric Schwitzgebel said...

Fred: It would depend on how much pleasure we get from fried chicken vs. how much suffering the chickens experience. If it's hugely pleasurable for us and they are capable of only the dimmest most pallid sort of suffering, then something like the utility monster thing is at work -- but that's not how most consequentialists interested in food ethics see things going!

Eric Schwitzgebel said...

Scott: On your second point: I mean both conscious and unconscious moral cognition -- not *just* our attempts to regiment it. Our intuitions break. One mundane case of this, I think, is how much readier we are to have intuitive moral reactions to humanoid robots -- even now! Honda's Asimo at Disneyland is so cute! I would have a lot of trouble stabbing him through the neck.

But need the tweaking be disastrous? This goes back to your first point. My guess is that "tweaking" probably underestimates how much change would be required under certain plausible scenarios. How does one measure disaster? To take an extreme case, suppose a superhuman AI decides to dismantle the solar system, including all people, to tile it with "hedonium": whatever computational structure (probably not very humanlike) most efficiently produces pleasure? Disaster? Well, I'd be inclined to say so. But on what grounds? And what about other types of scenarios, e.g., the production of a (superhumanly happy and intelligent and creative, let's assume) Matrioshka Brain, with variant scenarios for what happens to humanity during the process (e.g., destruction, sequestration with preservation, opting in). Hard to know what to think!

Alva Noe said...


Hi Eric,

I hadn't read your stuff when I wrote this:

http://www.npr.org/blogs/13.7/2015/01/23/379322864/the-ethics-of-the-singularity

It's an important issue, I think. Plan to follow up.

-- Alva

Eric Schwitzgebel said...

Thanks for the link, Alva! I'd missed your piece. The issues seem to be in the air.

Ensuring that future AIs have our values might or might not constitute slavery, I think, depending on exactly how it's implemented. (Rational argumentation might be okay, for example.) To have the kind of concern you're expressing seems right to me -- though I also think Bostrom is right to worry about the dangers to us.

One issue in this post is whether we *should* want them to have our values, if something like the Singularity ever occurs. Maybe our values are too provincial, and depend too much on presuppositions that will by then be falsified, for it to make sense to adapt our values to those circumstances without radical revision.

Simon said...

Eric, the fission/fusion monsters are another reason to approach identity, and in fact life, from a systems perspective, often looking at mereological relationships. This raises hierarchies of system integration like matryoshka dolls, Borg-like entities, your “Is the United States Conscious?” and Super Super Organisms. For me this raises issues of what it is to be emergent top-level modular beings that can decouple and function as extended beings. This way of thinking also helps to deal with split-brain-like examples of personal identity.

Quickly, regarding your “Is the United States Conscious?”: if you know the Red Dwarf sci-fi comedy series, there is an episode where the main characters arrive on a space station where the main computer creates a personal identity ‘aspect’ from their combined consciousnesses. By leaving the station they ‘kill’ that entity, raising moral and identity issues.

BTW while I don’t recall the reference I know of some talk that some species can be thought of as super super organisms/systems and if you think of it we ourselves aren’t fundamentally ‘individuals’ as we can only express ourselves fully through social ‘genes’. So even individual humans could be thought of as just parts of a greater extended entity. This is even stronger when dealing with reproduction.

Lastly, given that so much has been made of our cognitive biases, one could imagine that a lot of this comes from our particular evolutionary history combined with game theory: who wins survives. One wonders what totally different evolutionary and physiological histories might throw up, but I would still imagine that, whatever the history, many would be self-justifying.

chinaphil said...

On sci-fi, I think one good example might be the Foundation series, because Hari Seldon's institute has long-term ethical goals which are completely unfathomable to everyone else (due to the complexity of his maths), but interact in a range of ways with the interests/ethics of those around them. More recent, and a lot of fun, is China Mieville's Embassytown, which posits aliens who cannot lie, but want to learn how.

(Incidentally, much of the problem of evil theology engages directly with this point as well, and might be very much worth looking into.)

So back to this goals thing. I agree that Bostrom makes his argument more persuasive by not relying on consciousness, but I think he's wrong to imagine that you can have real goals without consciousness. If a computer has been programmed to achieve goal X, then it follows its steps to get to goal X. Along the way it may have some routines for overcoming obstacles on the way to X, but these are still just mechanistic, pre-posited paths. Consciousness is the ability to step outside of these processes, assess them, and formulate new plans or abandon the enterprise.

The flaw I find in Bostrom's book is that he imagines a machine that is smart/conscious enough to assess its methods, but not smart enough to assess its goals. I don't see how you could have one without the other. As an example look at humans: we're smart and conscious, and we assess our methods all the time - but in addition, we also assess our ends, so much so that we sometimes decide to kill ourselves, to not have sex, to forego food...

I can understand Bostrom's point as a relative one: if we made superintelligences that had a tendency to be superviolent, they could easily kill us all in one of their wars before they managed to evolve. But I can't see it as an absolute point. I can't see why any moral end-related programming we input (even the meta-programming that he suggests) at the beginning would survive the questioning which their smart/conscious brain would direct at it.

As that brilliant piece from Alva Noe suggests, they will develop their own ethics, and we'll have to engage with them on an ethical level.

chinaphil said...

Sorry, double comment again, but I've just realised that I can invert the argument I just made. If a being can't assess and change its own goals, then it's not conscious, and it's not legitimate to call those goals "the being's goals" - they're givens, not goals.

Perhaps this is something like a Kantian argument: in order to be a being, it has to have existence separate from its goals, so that we can see *it* as an end in itself. A computer program designed to do my taxes has no identity; its identity is its purpose. It's only when we can separate the identity from the purpose that our AI can become a thing unto itself. So maybe that's the worry with Bostrom's argument. If these AIs are (clever, meta-) programs for producing our values, then they will never be moral objects, things in themselves. As with children, it's the letting go of control that allows them to become complete people.

Mobius Trip said...

It seems the past is prologue. Could Plato have already given birth to the monster ethic by presupposing a foundation to mathematics (the Sun itself in the Allegory of the Cave), thus giving rise to the intuitions of calculators? Not until Gödel, the analogue of the Greek Prometheus, were the possibilities of freedom from the universe of necessity rekindled. Even worse that the human calculators are the Noble Liars with no ontological commitment.

Eric Schwitzgebel said...

Simon: I think I agree with everything you said -- and thanks for the references! I think the necessities of survival/reproduction are going to make some types of value systems much less likely than others -- a thought I start to explore in last year's post on Matrioshka Brains, and which I'm thinking of exploring a bit more in a coming post.

David said...

Hi Eric,

I think you’re asking all the right questions here!

Some (not me) might object that our conception of a rational agent is maximally substrate neutral. It's the idea of a creature we can only understand "voluminously" by treating it as responsive to reasons. According to some (Davidson/Brandom) this requires the agent to be social and linguistic – placing such serious constraints on "posthuman possibility space" as to render your discourse moot.
Even if we demur on this, it could be argued that the idea of a rational subject as such gives us a moral handle on any agent - no matter how grotesque or squishy. This seems true of the genus "utility monster". We can acknowledge that UM’s have goods and that consequentialism allows us to cavil about the merits of sacrificing our welfare for them. Likewise, agents with nebulous boundaries will still be agents and, so the story goes, rational subjects whose ideas of the good can be addressed by any other rational subject.
So according to this Kantian/interpretationist line, there is a universal moral framework that can grok any conceivable agent, even if we have to settle details about specific values via radical interpretation or telepathy. And this just flows from the idea of a rational being.
I think the Kantian/interpretationist response is wrong-headed. But showing why is pretty hard. A line of attack I pursue concedes to Brandom-Davidson that we have the craft to understand the agents we know about. But we have no non-normative understanding of the conditions something must satisfy to be an interpreting intentional system or an apt subject of interpretation (beyond commonplaces like heads not being full of sawdust).
So all we are left with is a suite of interpretative tricks whose limits of applicability are unknown. Far from being a transcendental condition on agency as such, it’s just a hack that might work for posthumans or aliens, or might not.
And if this is right, then there is no future-proof moral framework for dealing with feral Robots, Cthulhoid Monsters or the like. Following First Contact, we would be forced to revise our frameworks in ways that we cannot possibly have a handle on now. Posthuman ethics must proceed by way of experiment.
Or they might eat our brainz first.

Eric Schwitzgebel said...

Chinaphil: "As with children, it's the letting go of control that allows them to become complete people." Interesting thought!

I'll have to look again at the Bostrom; I don't recall that oversight in the text (and I agree it would be an oversight). I do agree that, if a fully functional general-intelligence AI is created, we should expect (both normatively and empirically) to engage it on its own ground, and expect its values (as well as ours) to change over time in a way that eludes our full control.

Eric Schwitzgebel said...

Mobius: The comment is a bit densely packed and metaphorical for me to understand! I do agree that the past is always prologue; nothing is born in philosophy without substantial precedent.

Eric Schwitzgebel said...

David, I think I agree with most of that, especially your concluding thought:

"So all we are left with is a suite of interpretative tricks whose limits of applicability are unknown. Far from being a transcendental condition on agency as such, it’s just a hack that might work for posthumans or aliens, or might not. And if this is right, then there is no future-proof moral framework for dealing with feral Robots, Cthulhoid Monsters or the like. Following First Contact, we would be forced to revise our frameworks in ways that we cannot possibly have a handle on now. Posthuman ethics must proceed by way of experiment."

I do think we can anticipate that enduring creatures (like the "Matrioshka Brain" of one of my earlier posts) are likely to have a structure of self-preserving or at least species-preserving goals that will make them partially interpretable; and if they can manipulate elements generatively, in communication with others of their kind, according to rules or quasi-rulish regularities, then, well, maybe we have language there. But I think these might be empirical likelihoods rather than strict necessities.

So the possibility of some kind of rational engagement with aliens seems not far-fetched to me, contra the darkest pessimists (like Stanislaw Lem?); but I think reflection on these types of examples suggests that we might not get as far as we might have thought without some serious rethinking.

All this I take to be a similar path to a similar conclusion as what you express above.

John Baez said...

Hi, Eric! There's a lively conversation based on this article happening here:

https://plus.google.com/u/0/117663015413546257905/posts/Kmpf8JsdxKC

Simon said...

If they have different morals can they be saved? http://gizmodo.com/when-superintelligent-ai-arrives-will-religions-try-t-1682837922/+charliejane

Simon said...

Eric here is the episode if you are interested

Red Dwarf 32 "Legion"
Chasing the vapour trail of Red Dwarf into a gas nebula, Starbug is taken over by a tractor beam which takes it to a space station. There the crew discover Legion, a highly intelligent, sophisticated and cultured lifeform conceived out of an experiment by a group of famous scientists. It is Legion who modifies Rimmer's holo-projection unit, enabling him to become a "hardlight" hologram; as a result he is able to touch, feel, eat, and experience pain, but, still being made of light, cannot be physically harmed. They learn that Legion is composed from the minds of each member of the crew, combined and magnified, and as such they are sustaining his very existence with their presence. Legion will not allow them to leave and continue the search for Red Dwarf.

I'd be interested to know how you would characterize this entity?

Lauren Seiler describes humans as “poly-super-organisms” and argues that the system boundaries between life and non-life are very indistinct.

Callan S. said...

And Rimmer's idea that aliens would give him a new body was...essentially correct in the end!

I'll get me coat...lol!

Sergio Graziosi said...

Eric,
Thank you so much, I am (late and) delighted by your comments. Your post, along with Hankins's and Bakker's contributions, has prompted me to write an attempt at making the positive case.
It's here:
Strong AI, Utilitarianism and our limited minds. All feedback, especially criticism, is more than welcome.

Katherine Tevis said...

Hello Eric,

Funny coincidence, I used to be obsessed in college with the idea that all future AIs would deserve rights or at least moral consideration.

What strikes me as odd in this whole debate is encapsulated in the following quote: "Our moral systems, whether deontological, consequentialist, virtue ethical, or relatively untheorized and intuitive, take as a background assumption that the moral community is composed of stably distinct individuals with roughly equal cognitive and emotional capacities (with special provisions for non-human animals, human infants, and people with severe mental disabilities)."

I don't think you can maintain that this "background assumption" is even relevant to intrinsic morality, unless you ditch the "special provisions." Either morality is dependent on such a background assumption; or else our "special provisions" are problematic, because they allow moral consideration that goes against our assumptions. You can't have it both ways.

I personally subscribe to a position influenced by the animal rights movement, antinatalism, and Buddhism. The only moral question is not "can they think" or "how do they think" or "are they individuals, fragments or wholes," but "can they suffer." All other questions are not moral questions; they are only technical. So, in order to answer your questions about AI and fission/fusion monsters (btw, a Hindu might maintain that we are already fission/fusion monsters in a sense), we would have to consider each entity, in its particular space-time coordinates, for its ability to suffer. Pleasure, it seems to me, would not be relevant.

Eric Schwitzgebel said...

Thanks for the link, John! Yes, very interesting and lively thread over there.

Eric Schwitzgebel said...

Fun link, Simon! I've been playing around with concepts of AI divinity, in some of my stories and in posts like "Our Possible Imminent Divinity" and "Super-Cool Theodicy". Highly unorthodox, though. Or check out Eric Steinhart's recent book!

Eric Schwitzgebel said...

Simon: I will put Red Dwarf on my list. It might depend a lot on how it is implemented, which might not be revealed. I'm working on a group organism sci-fi story of my own too, right now -- will put a bit more energy into showing the details of implementation, to squeeze out some of the philosophical juice.

Eric Schwitzgebel said...

Neat post, Sergio. I've left a comment over there.

Eric Schwitzgebel said...

Katherine: Thanks for that interesting comment! I think that is one among a range of reasonable approaches. For purposes of the argument, I was trying to assume that the AI would be similar in all relevant psychological respects, including capacity for suffering. To the extent the question turns on suffering, the special provisions would apply to beings in those categories *not* capable of suffering, if any. You might think that no being incapable of suffering deserves moral status -- maybe that's true (and more plausible than an equivalent thesis about higher cognition) but some people would make an exception for early-stage fetuses or people (if any) with such severe brain damage that neither pleasure nor suffering is possible or (hypothetically) a being capable only of pleasure but not suffering.

Now if by "suffering" you mean "dukkha", maybe that takes the conversation in a different direction.

Sandy Ryan said...

"That is all fine and dandy but my daughter is NOT going to prom with a robot!" -Some dad, some day

Eric Schwitzgebel said...

LOL, sis. You just wait and see!