Tuesday, March 21, 2023

The Emotional Alignment Design Policy

I've been writing a lot recently about what Mara Garza and I, since 2015, have been calling the Design Policy of the Excluded Middle: Don't create AI systems of disputable moral status. Doing so, one courts the risk of either underattributing or overattributing rights to the systems, and both directions of error are likely to have serious moral costs.

(Violations of the Design Policy of the Excluded Middle are especially troubling when some well-informed experts reasonably hold that the AI systems are far below having humanlike moral standing and other well-informed experts reasonably hold that the AI systems deserve moral consideration similar to that of humans. The policy comes in various strengths in terms of (a.) how wide a range of uncertainty to tolerate, and (b.) how high a bar is required for legitimate disputability. More on this in a future post, I hope.)

Today, I want to highlight another design policy Garza and I advocated in 2015: The Emotional Alignment Design Policy.

Design AI systems so that ordinary users have emotional reactions appropriate to the systems' genuine moral status.

Joanna Bryson articulates one half of this design policy in her well-known (and in my view unfortunately titled) article "Robots Should Be Slaves". According to Bryson, robots -- and AI systems in general -- are disposable tools and should be treated as such. User interfaces that encourage people to think of AI systems as anything more than disposable tools -- for example, as real companions, capable of genuine pleasure or suffering -- should be discouraged. We don't want ordinary people fooled into thinking it would be morally wrong to delete their AI "friend". And we don't want people sacrificing real human interests for what are basically complicated toasters.

Now to be clear, I think tools -- and even rocks -- can and should be valued. There's something a bit gratingly consumerist about the phrase "disposable tools" that I am inclined to use here. But I do want to highlight the difference in the type of moral status possessed, say, by a beautiful automobile versus that possessed by a human, cat, or even maybe garden snail.

The other half of the Emotional Alignment Design Policy, which goes beyond Bryson, is this: If we do someday create AI entities with real moral considerability similar to non-human animals or similar to humans, we should design them so that ordinary users will emotionally react to them in a way that is appropriate to their moral status. Don't design a human-grade AI capable of real pain and suffering, with human-like goals, rationality, and thoughts of the future, and put it in a bland box that people would be inclined to casually reformat. And if the AI warrants an intermediate level of concern -- similar, say, to a pet cat -- then give it an interface that encourages users to give it that amount of concern and no more.

I have two complementary concerns here.

One -- the nearer-term concern -- is that tech companies will be motivated to create AI systems that users emotionally attach to. Consider, for example, Replika, advertised as "the world's best AI friend". You can design an avatar for the Replika chat-bot, give it a name, and buy it clothes. You can continue conversations with it over the course of days, months, even years, and it will remember aspects of your previous interactions. Ordinary users sometimes report falling in love with their Replika. With a paid subscription, you can get Replika to send you "spicy" selfies, and it's not too hard to coax it into erotic chat. (This feature was apparently toned down in February after word got out that children were having "adult" conversations with Replika.)

Now I'm inclined to doubt that ordinary users will fall in love with the current version of Replika in a way that is importantly different from how a child might love a teddy bear or a vintage automobile enthusiast might love their 1920 Model T. We know to leave these things behind in a real emergency. Reformatting or discontinuing Replika might be upsetting to people who are attached, but I don't think ordinary users would regard it as the moral equivalent of murder.

My worry is that it might not take too many more steps of technological improvement before ordinary users can become confused and can come to form emotional connections that are inappropriate to the type of thing that AI currently is. If we put our best chatbot in an attractive, furry pet-like body, give it voice-to-text and text-to-speech interfaces so that you can talk to it orally, give it an emotionally expressive face and tone of voice, give it long-term memory of previous interactions as context for new interactions -- well, then maybe users do really start to fall more seriously in love or at least treat it as having the moral standing of a pet mammal. This might be so even with technology not much different from what we currently have, about which there is generally expert consensus that it lacks meaningful moral standing.

It's easy to imagine how tech companies might be motivated to encourage inflated attachment to AI systems. Attached users will have high product loyalty. They will pay for monthly subscriptions. They will buy enhancements and extras. We already see a version of this with Replika. The Emotional Alignment Design Policy puts a lid on this: It should be clear that this is an interactive teddy-bear, nothing more. Buy cute clothes for your teddy bear, sure! But forgo the $4000 cancer treatment you might give to a beloved dog.

The longer-term concern is the converse: that tech companies will be inclined to make AI systems disposable even if those AI systems, eventually, are really conscious or sentient and really deserve rights. This possibility has been imagined over and over in science fiction, from Asimov's robot stories through Star Trek: The Next Generation, Black Mirror, and Westworld.

Now there is, I think, one thing a bit unrealistic about those fictions: The disposable AI systems are designed to look human or humanoid in a way that engages users' sympathy. (Maybe that's a function of the fictional medium: From a fiction-writing perspective, humanlike features help engage readers' and viewers' sympathy.) More realistic, probably, is the idea that if the tech companies want to minimize annoying protests about AI rights, they will give the robots or AI systems bland, not-at-all-humanlike interfaces that minimize sympathetic reactions, such as the shipboard computer in Star Trek or the boxy robots in Interstellar.

[Image: the boxy TARS robot from Interstellar]

The fundamental problem in both directions is that companies' profit incentives might misalign with AI systems' moral status. For some uses, companies might be incentivized to trick users into overattributing moral status, to extract additional money from overly attached users. In other cases, companies might be incentivized to downplay the moral status of their creations -- for example, if consciousness/sentience proves to be a useful feature to build into the most sophisticated future AI workers.

The Emotional Alignment Design Policy, if adhered to, will reduce these moral risks.


James of Seattle said...

I am absolutely in favor of the Emotional Alignment Design Policy. The hard part, of course, is determining the moral status of the machine. The danger I see is that consciousness and sentience are conflated w/ moral status. For example, I can imagine a conscious or sentient machine which has no interest in its continued existence and so would provide no moral difficulty with turning it off. The Emotional Alignment Design Policy would then suggest against giving such a machine the form of something which does have an interest in continued existence, such as any animal or human.


Paul D. Van Pelt said...

An emotional alignment design goes as far as it might. Policy is the problem, seems to me. Here is how things work in the world. There are laws, and regulations, and policies. Laws say, broadly, how things will be. Regulations more finely tune interpretation(s) of law, so that (it is hoped) those will fit particular realities.
Policy is far more local, and locality, by its nature, weakens policy, so as to further dilute the primary fairness and intention of originating law. This entire hierarchy, while intending to benefit, ends up being detrimental. Squishy. Having a plasticity, counterproductive to the original ideas of clarity and simplicity. I was charged with writing this stuff---long before I ever met philosophy.

Philosopher Eric said...

I’m only half concerned about this question right now, or the half where chatbots and such become so convincing that they trick people into believing that they phenomenally experience their existence. Here laws could be passed which protect such machines to the detriment of people. Then once the physics of sentience does become empirically demonstrated, I expect laws to be passed which generally make it illegal to build machines which feel negative. Perhaps only a government would be permitted to go this way, or maybe a company under heavy regulation.

One point to bring up however is that mandating no AI be created of disputable moral status, implies that the concept of “morality” itself isn’t disputed. Millennia of philosophy suggest otherwise.

I consider morality to essentially exist as an evolved social tool of persuasion. We feel morally wronged, for example, when social convention suggests that we’ve been morally wronged. If we take things back to the non-socially determined idea of sentience itself however, and if the physics which constitute sentience were somewhat grasped, well founded policies should result. And what might we say of any repugnant implications to certain utility based policies? Given that reality itself can be repugnant, these sorts of implications should at times be expected.

Howard said...

Hi Eric

Are you hinting that AI will have a hybrid or fluid status just like animals? With some animals being human and some being things?
Why do we have to assimilate AI into prefabricated categories such as human?
If AI is unique perhaps they will grow into a category of their own, perhaps including the divine and the Leviathan

Paul D. Van Pelt said...

Well, that may not be novel. I had not heard or read it before today. A different state of evolution? Or, just a continuation of what some have implied for AI and what 'evolves' later. I can't say impossible because I am no longer sure of what impossible is. Have not thought much of Panpsychism since hearing of its resurgence. Perhaps these notions are algorithmic---part of a very long view. And that big picture, as re-imagined by Professor Carroll, a second time around. I wonder. Do mathematics and physics predict different life forms? I expect someone will say---has already said: sure they do. Guess I'll try looking it up. Point of departure: we did not invent evolution. Shall we presume to re-invent it? That would be a whole new metaphysics, would it not?

Paul D. Van Pelt said...

I sometimes use a playful nature, even a sense of the absurd, to encourage thinkers to try harder, think better and do THEIR best with what they have and know. A philosopher I admire has done this in his own book on intuition and other tools for thinking. The discussion, ongoing here, was gently directed by some of my playfulness. I hope so anyway. Another post has asked about reality, almost simultaneously with coincidence. I did not 'play dumb' with those questions, because they exhibited a propensity towards fallacy of one sort, or dozens. If, and only if, a philosopher hopes to one day 'make a difference that makes a difference', he or she must focus upon intent and reject interest, preference, motive (personal). See, if philosophy is not bigger than societal improvement over personal gain, it is not worth much. Contextual reality, vanity and narcissism win. '...difference that...' never had a chance. Socrates knew this.
Why else would someone commit suicide? With a good knife in hand, I would have made the morons kill me, and cut a few, in the process.

Howard said...

so if you were incarcerated and shared your cell with a rock, you'd say "hey this is a valuable rock, I'm glad we're here together"?
I'd understand on a desert island and a tree or sand, and maybe even a rock.
Sure existence itself is amazing, but it's a very low orbital and I'm unsure what the argument here is. Maybe more sleight of hand than anything

Paul D. Van Pelt said...

YAWN. Bye, Eric. It was fun for a moment. Seems the fallacy mode has infected the best blogs. I don't even know what was objectionable in what I wrote. Must have been what I said about Howard the Duck? He was a cartoon from the 1970s. But, those remarks were not here---not now. So, why did my honesty attract deception? Sorry. I have no time for fallacy. Or misplaced insolence. Could it be complexity has overarched everything else, in the world that WE made? I think complexity is worse than say, incompatibility. Far worse, than coincidence? JeanPierre Legros said some things about that today. In any or either case, I have no time for ignorance or narcissism. Or foolishness.

chinaphil said...

I wonder if there's a slight conflict with your theory of moral mediocrity? MM theory says that in general we don't make great moral decisions because we choose only average quality. Therefore, might it not be better to heighten moral salience in product design in order to inspire, on average, better moral responses in users?
All of which becomes ridiculously muddy because we don't have much consensus on what's morally right, nor any realistic scale for measuring rights and wrongs. But I think there are some existing examples of product design for moral salience, particularly in healthcare: mandatory counselling before using an assisted suicide service, and mandatory viewing of material on alternatives before using an abortion service (controversial, I know, offered here merely as an example of an existing thing to try to ground the conversation).
In the terms that you were talking about: if we got an AI teddy/girlfriend with current, non-sentient technology, and fell in love with it, and devoted real resources to it... it's not obvious to me that that would be much of a harm. No more of a harm than people spending money on their cars, which is not a moral question for most people. Whereas the AI Lives Matter case does seem like a big moral harm.
So there are a few factors that suggest to me that a perfect alignment between the moral reality of a thing and the moral feel of a thing may not be the best goal. In fact, it may be that we need in general much more moral prompting, not less; so a general policy of increasing moral feel might be a better policy.

Eric Schwitzgebel said...

Thanks for the comments, everyone, and sorry about the slow reply. I was traveling until yesterday.

James: I worry that a conscious machine with no interest in its continued existence might still warrant substantial moral concern even if it itself doesn't think it does. I guess I'm inclined to think that consciousness, and not more specifically conscious sentience, is important.

Paul: I agree that "policy" has a bit of a local feel to it. I guess I'm inclined to think that "policy" is the right choice nonetheless, since implementation will be squishy and local and in some cases overridden by other considerations. As far as I'm concerned, you didn't say anything offensive. Nothing wrong with a bit of play!

Philosopher Eric: I'm more of a moral realist than you are, I suppose. Your comment brings out that there are two sources of uncertainty here: uncertainty about moral theory and uncertainty about whether the entity meets the criteria for rights according to the chosen moral theory. My inclination is to think that both should feed the Design Policy of the Excluded Middle -- so that if there's uncertainty according to a leading moral theory, we should avoid creating the AI in question. (We shouldn't be too permissive about what counts as a leading moral theory, however, lest the policy become too restrictive.)

Howard: I agree. Some future AI probably won't fit precisely into any of our existing categories. On the value of rocks: I think humans are much more valuable, even though rocks have some intrinsic value.

Chinaphil: I hadn't considered the connection with moral mediocrity. Interesting suggestion to enhance the moral feel. My worry here is partly about sacrificing human interests for entities not worth the sacrifice. You're right, of course, that we spend money on cars, etc., and -- unless one goes toward Singer -- this doesn't seem to be a moral problem as long as it's within reason. It seems to me that the sociality or attribution of consciousness/sentience adds a level to the issue that isn't present for the car, making it qualitatively different. I also worry that a policy of enhancing moral reactiveness might play into corporate incentives to make us attach to AI systems so that we will pay more than is reasonable for subscriptions and upgrades. On the other hand, you do have a point that we probably tend to be underreactive, and it might make sense to compensate for that somehow.

Arnold said...

AI machine thermal dynamics, AI ethical mortals analysis, equal energy's useful origins...
...balance: positive negative neutral, active passive neutral, for movement...

Tim Smith said...


The Emotional Alignment Design Policy is fraught with the uncertainty of emotional reactions, and I would push for a lower bar of possible moral standing. If you are talking in hypotheticals, I will side with Paul that we need a better return on investment from this thought experiment. But your use case is more real daily. Many storylines portray these struggles, as you mention.

I'm thinking of Tom Hanks's character Chuck Noland and his friend Wilson, who was a volleyball in the movie Cast Away, and also Kevin Pearson, played by Justin Hartley in the series 'This is Us', who attaches significance to his father's necklace. In both cases, the attachment is displacement, though one is personified and the other just a rock. I'm not above the model that all emotional and moral concerns are displaced and that this is the baseline we need to tread. We can't know the importance of 'The Other' even if that other is a necklace or speck of dust, as people vary in their self-awareness if not standards.

As for commercial AI and the disclaimers required, these concerns are best expressed with this baseline. Recently I purchased some nail-less picture hangers whose instructions carried the direction "do not hang irreplaceable objects" (as an instruction, not a disclaimer), even if the object was well within the weight spec of the stick-on hanger. Placing this kind of disclaimer clearly in the user handbook is likely the best course for marketing AI bots and applications to consumers. Assume people can attach to volleyballs or rocks.

For AI to claim moral standing isn't for humans to decide. Unlike in the case of animals, where communication is unclear, we can let the AI make its case here as long as the program is not biased by its creator, which is impossible, but we could get close. Bots are not animals, after all, and they can tell us what they think; in fact, that might be a baseline for this quandary right there.

The current chat AI offerings are sparking interest in this past work. The approach is good to consider, even if we aren't quite there yet.

Thanks for re-posting this theory,