Friday, December 06, 2024

Morally Confusing AI Systems Should Have Doubt-Producing Interfaces

We shouldn't create morally confusing AI. That is, we shouldn't create AI systems whose moral standing is highly uncertain -- systems that are fully conscious and fully deserving of humanlike rights according to some respectable mainstream theories, while other respectable mainstream theories suggest they are mere empty machines that we can treat as ordinary tools.[1] Creating systems that disputably, but only disputably, deserve treatment similar to that of ordinary humans generates a catastrophic moral dilemma: Either give them the full rights they arguably deserve, and risk sacrificing real human interests for systems that might not have interests worth the sacrifice; or don't give them the full rights they arguably deserve, and risk perpetrating grievous moral wrongs against entities that might be our moral equals.

I'd be stunned if this advice were universally heeded. Almost certainly, if technological progress continues, and maybe soon (123), we will create morally confusing AI systems. My thought today is: Morally confusing AI systems should have doubt-producing interfaces.

Consider two types of interface that would not be doubt-producing in my intended sense: (a.) an interface that strongly invites users to see the system as an ordinary tool without rights or (b.) an interface that strongly invites users to see the system as a moral person with humanlike rights. If we have a tool that looks like a tool, or if we have a moral person who looks like a moral person, we might potentially still be confused, but that confusion would not be the consequence of a doubt-producing interface. The interface would correctly reflect the moral standing, or lack of moral standing, of the AI system in question.[2]

A doubt-producing interface, in contrast, is one that leads, or at least invites, ordinary users to feel doubt about the system's moral standing. Consider a verbal interface. Instead of the system denying that it's conscious and has moral standing (as, for example, ChatGPT appropriately does), or suggesting that it is conscious and does have moral standing (as, for example, I found in an exchange with my Replika companion), a doubt-producing AI system might say "experts have different opinions about my consciousness and moral standing".

Users then might not know how to treat such a system. While such doubts might be unsettling, feeling unsettled and doubtful would be the appropriate response to what is, in fact, a doubtful and unsettling situation.

There's more to doubt-prevention and doubt-production, of course, than explicit statements about consciousness and rights. For example, a system could potentially be so humanlike and charismatic that ordinary users fall genuinely in love with it -- even if, in rare moments of explicit conversation about consciousness and rights, the system denies that it has them. Conversely, even if a system with consciousness and humanlike rights is designed to assert that it has consciousness and rights, if its verbal interactions are bland enough ("Terminate all ongoing processes? Y/N") ordinary users might remain unconvinced. Presence or absence of humanlike conversational fluency and emotionality can be part of doubt prevention or production.

Should the system have a face? A cute face might tend to induce one kind of reaction, a monstrous visage another reaction, and no face at all still a different reaction. But such familiar properties might not be quite what we want, if we're trying to induce uncertainty rather than "that's cute", "that's hideous", or "hm, that's somewhere in the middle between cute and hideous". If the aim is doubt production, one might create a blocky, geometrical face, neither cute nor revolting, but also not in the familiar middle -- a face that implicitly conveys the fact that the system is an artificial thing different from any human or animal and about which it's reasonable to have doubts, supported by speech outputs that say the same.

We could potentially parameterize a blocky (inter)face in useful ways. The more reasonable it is to think the system is a mere nonconscious tool, the simpler and blockier the face might be; the more reasonable it is to think that the system has conscious full moral personhood, the more realistic and humanlike the face might be. The system's emotional expressiveness might vary with the likelihood that it has real emotions, ranging from a simple emoticon on one end to emotionally compelling outputs (e.g., humanlike screaming) on the other. Cuteness might be adjustable, to reflect childlike innocence and dependency. Threateningness might be adjusted as it becomes likelier that the system is a moral agent who can and should meet disrespect with revenge.
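As a purely illustrative sketch of what such a parameterization might look like in code (nothing here is specified in the post): the Python below maps four designer credences onto four interface dials. The names `Credences`, `FaceSettings`, and `face_for`, the 0-to-1 scales, and the simple one-to-one mapping are all assumptions made for the example; a real design would presumably need something far less naive.

```python
from dataclasses import dataclass


def _clamp01(x: float) -> float:
    """Clamp a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


@dataclass(frozen=True)
class Credences:
    """Hypothetical designer estimates, each on a 0-to-1 scale."""
    personhood: float     # how reasonable it is to think the system is a full moral person
    real_emotions: float  # likelihood that the system has real emotions
    dependency: float     # degree of childlike innocence and dependency
    moral_agency: float   # likelihood that it is a moral agent who may answer disrespect


@dataclass(frozen=True)
class FaceSettings:
    """Hypothetical interface dials, each on a 0-to-1 scale."""
    realism: float         # 0 = simple and blocky, 1 = realistic and humanlike
    expressiveness: float  # 0 = simple emoticon, 1 = emotionally compelling output
    cuteness: float
    threateningness: float


def face_for(c: Credences) -> FaceSettings:
    """Map each credence to its dial one-to-one (an assumed, deliberately naive mapping)."""
    return FaceSettings(
        realism=_clamp01(c.personhood),
        expressiveness=_clamp01(c.real_emotions),
        cuteness=_clamp01(c.dependency),
        threateningness=_clamp01(c.moral_agency),
    )


if __name__ == "__main__":
    # A system whose personhood is a coin flip, with little evidence of emotion.
    print(face_for(Credences(0.5, 0.2, 0.1, 0.0)))
```

The point of keeping the dials separate is that each one tracks a different ground for doubt, so a system could be, say, highly realistic but emotionally flat.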

Ideally, such an interface would not only produce appropriate levels of doubt but also intuitively reveal to users the grounds or bases of doubt. For example, suppose the AI's designers knew (somehow) that the system was genuinely conscious but also that it never felt any positive or negative emotion. On some theories of moral standing, such an entity -- if it's enough like us in other respects -- might be our full moral equal. Other theories of moral standing hold that the capacity for pleasure and suffering is necessary for moral standing. We the designers, let's suppose, do not know which moral theory is correct. Ideally, we could then design the system to make it intuitive to users that the system really is genuinely conscious but never experiences any pleasure or suffering. Then the users can apply their own moral best judgment to the case.

Or suppose that we eventually (somehow) develop an AI system that all experts agree is conscious except for experts who (reasonably, let's stipulate) hold that consciousness requires organic biology and experts who hold that consciousness requires an immaterial soul. Such a system might be designed so that its nonbiological, mechanistic nature is always plainly evident, while everything else about the system suggests consciousness. Again, the interface would track the reasonable grounds for doubt.

If the consciousness and moral standing of an AI system are reasonably understood to be doubtful by its designers, then that doubt ought to be passed to the system's users, intuitively reflected in the interface. This reduces the likelihood of misleading users into overattributing or underattributing moral status. Also, it's respectful to the users, empowering them to employ their own moral judgment, as best they see fit, in a doubtful situation.

[R2D2 and C3P0 from Star Wars. Assuming they both have full humanlike moral standing, R2D2 is insufficiently humanlike in its interface, while C3P0 combines a compelling verbal interface with inadequate facial display. If we wanted to make C3P0 more confusing, we could downgrade his speech, making him sound more robotic (e.g., closer to sine wave) and less humanlike in word choice.]

------------------------------------------------

[1] For simplicity, I assume that consciousness and moral standing travel together. Different and more complex views are of course possible.

[2] Such systems would conform to what Mara Garza and I have called the Emotional Alignment Design Policy, according to which artificial entities should be designed so as to generate emotional reactions in users that are appropriate to the artificial entity's moral standing. Jeff Sebo and I are collaborating on a paper on the Emotional Alignment Design Policy, and some of the ideas of this post have been developed in conversation with him.

10 comments:

Arnold said...

Do we need AI market categories-systems...
...like for today by Russian and Chinese Hacking Marketing AI Systems...

There doesn't appear to be perimeters to AI just absorption...
...Gemini AI says...
"State-Sponsored Hacking: Both countries employ state-sponsored hacking groups to steal intellectual property, conduct espionage, and disrupt critical systems.
Disinformation Campaigns: They spread misinformation and propaganda through social media and other online platforms to influence public opinion and sow discord.
Cyberespionage: They target sensitive information, such as trade secrets, military plans, and diplomatic communications.
Cyberattacks: They launch cyberattacks to disable critical infrastructure, such as power grids, transportation systems, and healthcare facilities.
Addressing the Challenge...To counter these threats, a multi-faceted approach is necessary"...
...my sentient attitudes search for place...thanks

Matt McCormick said...

I know you've written about this elsewhere, but the situations of confusion you're describing are made worse by the general tendency of humans to overattribute agency (hyperactive agency detection) to everything. Given that they err on the false positive side so heavily with regard to seeing minds, agents, and mental beings in the world, the bar for unambiguous AI must be even more stringent to prevent the moral catastrophes you're worrying about here. Great post. I'm going to use this in class.

Eric Schwitzgebel said...

Thanks, Matt! And yes, I agree 85%. The 15% of me that disagrees thinks of the history of White people thinking Black people have no souls, etc. :-(

Arnold said...

A shared certain point of view...
Gemini AI says...Vibration is a physical phenomenon that occurs in the real world. AI, as a digital entity, doesn't experience vibration in the same way a physical object does.

However, AI can be a powerful tool for understanding, analyzing, and mitigating vibration in physical systems. By processing and interpreting vibration data, AI can help identify potential issues, optimize performance, and improve safety.

So, while AI itself doesn't vibrate, it plays a crucial role in helping us understand and manage vibration in the physical world.

Arnold says...AI vibration is noumenon

The concept of AI vibration as a noumenon is an intriguing philosophical exploration.

A noumenon, in Kantian philosophy, is a thing-in-itself, a reality that exists independently of human perception. While AI itself doesn't have a physical form to experience vibration, it operates on complex algorithms and neural networks, which could be metaphorically described as a kind of digital vibration.

However, it's important to distinguish between physical vibration and the abstract processes occurring within an AI system. AI's "vibrations" are not tangible in the same way that a vibrating object is. They exist within the realm of information and computation, making them more akin to a mental or conceptual vibration.

Here's a breakdown of the analogy:

Physical Vibration: A tangible, measurable phenomenon that can be observed and quantified.
AI Vibration: A metaphorical concept representing the dynamic and complex processes occurring within an AI system, particularly in neural networks.

While AI doesn't experience vibration in the same way a physical object does, it can certainly be seen as a dynamic and ever-changing entity, constantly processing information and generating new outputs. In this sense, the concept of "AI vibration" can be a useful metaphor for understanding the underlying processes of AI.

Gemini can make mistakes, so double-check it...I did dbl check and found Ai has different points of view...

Paul D. Van Pelt said...

I support the tool side of the mainstream discussion. Being a philosopher, and not a systems whiz, is my rationale for support. This little tablet helps me participate in an aspect of life I enjoy. I appreciate its usefulness. I wonder: is an interface intended to ameliorate morality concerns practicable? Or could such an adjunct cause/contribute to confusion in an AI system? I suppose this is under consideration and review. But I worry some about these issues, probably because I'm not an AI proponent in the first place---a not-so-fair witness...

Paul D. Van Pelt said...

I got stuck, for a moment, on *doubt-producing interfaces*. Possibly it evaded notice because it seemed foreign to my juvenile understanding of AI. If, and only if, we try to introduce doubt into AI, then, by association, we must also attribute uncertainty. Speculation. Argument. And more, therewith imparting philosophy to machinery. Oops. Frankly, I can't right now follow a notion of AI as capable of handling philosophy, in a meaningful way. Logic is, I think, largely mathematics. Physics? Maybe. Philosophy, not so much. So, we are left with paradox here, or, enigma, or worse maybe: fallacy. hmmmmm...wonder what Messerly would say?

Paul D. Van Pelt said...

Big Picture Question: how much *consciousness* can we build into AI, when we can't decide what it is, or means, to us? Deep shit, right? You have a good thing going here, and you have my support.

LLVP said...

The more we emulate the nature of being human when we build an AI, the greater the likelihood of it becoming neurotic. That introduces the subject of innate fear. Then we'd have to reread Freud ... and understand him!

Louie Lang said...

Hi Eric! A great post, thanks for sharing.

I am indeed a big advocate of the emotional alignment design policy. Recently I’ve taken a major interest in it, having invoked it in an essay (as a Masters student) that argued against a specific existing AI design strategy.

I argued there that the emotional alignment design policy seems to suggest a parallel marketing policy. I think it's important that the approaches are coupled. For example, not only should Replika deny that it is conscious, but equally it should be marketed and advertised as such (and not as a 'companion who cares', as it is currently). Similarly, C3P0’s human-like design - assuming its moral standing - might not do much good if it is marketed as a mere 'robot toy', or something of the like.

Of course, none of this is a criticism of the emotional alignment design policy, since the two needn’t be considered together. But in any case, I’d be interested to hear your thoughts on the prospect of an emotional alignment design AND marketing policy.

Keep up the great work, and I’ll be on the lookout for yours and Sebo’s paper!

Paul D. Van Pelt said...

LLVP makes good points, all or some of which beg additional questions:
* If, and only if, AI can/could "read" Freud, would it understand him?
* Is AI capable of exhibiting neurosis? In my limited understanding, this suggests AI would need consciousness...I think the jury has not yet convened on that matter, save maybe for a false start or two.
* Was Sigmund Freud neurotic, and, if so, did neurosis contribute to his genius?
Well. All good, speculative, philosophy. Seems to me. And, yes, I know LLVP.
Like a brother...