Tuesday, July 23, 2024

A Metaethics of Alien Convergence

I'm not a metaethicist, but I am a moral realist (I think there are facts about what really is morally right and wrong) and also -- bracketing some moments of skeptical weirdness -- a naturalist (I hold that scientific defensibility is essential to justification).  Some people think that moral realism and naturalism conflict, since moral truths seem to lie beyond the reach of science.  They hold that science can discover what is, but not what ought to be; that it can discover what people regard as ethical or unethical, but not what really is ethical or unethical.

Addressing this apparent conflict between moral realism and scientific naturalism (for example, in a panel discussion with Stephen Wolfram and others a few months ago), I find I have a somewhat different metaethical perspective than others I know.

Generally speaking, I favor what we might call a rational convergence model, in broadly the vein of Firth, Habermas, Railton, and Scanlon (bracketing what, to insiders, will seem like huge differences).  An action is ethically good if it is the kind of action people would tend on reflection to endorse.  Or, more cautiously, if it's the kind of action that certain types of observers, in certain types of conditions, would tend, upon certain types of reflection, to converge on endorsing.

Immediately, four things stand out about this metaethical picture:

(1.) It is extremely vague.  It's more of a framework for a view than an actual view, until the types of observers, conditions, and reflection are specified.

(2.) It might seem to reverse the order of explanation.  One might have thought that rational convergence, to the extent it exists, would be explained by observers noticing ethical facts that hold independently of any hypothetical convergence, not vice versa.

(3.) It's entirely naturalistic, and perhaps for that reason disappointing to some.  No non-natural facts are required.  We can scientifically address questions about what conclusions observers will tend to converge on.  If you're looking for a moral "ought" that transcends every scientifically approachable "is" and "would", you won't find it here.  Moral facts turn out just to be facts about what would happen in certain conditions.

(4.) It's stipulative and revisionary.  I'm not saying that this is what ordinary people do mean by "ethical".  Rather, I'm inviting us to conceptualize ethical facts this way.  If we fill out the details correctly, we can get most of what we should want from ethics.

Specifying a bit more: The issue to which I've given the most thought is who the relevant observers are whose hypothetical convergence constitutes the criterion of morality.  I propose: developmentally expensive and behaviorally sophisticated social entities, of any form.  Imagine a community not just of humans but of post-humans (if any), and alien intelligences, and sufficiently advanced AI systems, actual and hypothetical.  What would this diverse group of intelligences tend to agree on?  Note that the hypothesized group is broader than humans but narrower than all rational agents.  I'm not sure any other convergence theorist has conceptualized the set of observers in exactly this way.  (I welcome pointers to relevant work.)

[Dall-E image of a large auditorium of aliens, robots, humans, sea monsters, and other entities arguing with each other]

You might think that the answer would be the empty set: Such a diverse group would agree on nothing.  For any potential action that one alien or AI system might approve of, we can imagine another alien or AI system who intractably disapproves of that action.  But this is too quick, for two reasons:

First, my metaethical view requires only a tendency for members of this group to approve.  If there are a few outlier species, no problem, as long as approval would be sufficiently widespread in a broad enough range of suitable conditions.

(Right, I haven't specified the types of conditions and types of reflection.  Let me gesture vaguely toward conditions of extended reflection involving exposure to a wide range of relevant facts and exposure to a wide range of alternative views, in reflective conditions of open dialogue.)

Second, as I've emphasized, though the group isn't just humans, not just any old intelligent reasoner gets to be in the club.  There's a reason I specify developmentally expensive and behaviorally sophisticated social entities.  Developmental expense entails that life is not cheap.  Behavioral sophistication entails (stipulatively, as I would define "behavioral sophistication") a capacity for structuring complex long-term goals, coordinating in sophisticated ways with others, and communicating via language at least as expressively flexible and powerful as human language.  And sociality entails that such sophisticated coordination and communication happens in a complex, stable, social network of some sort.

To see how these constraints generate predictive power, consider the case of deception.  It seems clear that any well-functioning society will need some communicative norms that favor truth-telling over deceit, if the communication is going to be useful.  Similarly, there will need to be some norms against excessive freeloading.  These needn't be exceptionless norms, and they needn't take the same form in every society of every type of entity.  Maybe, even, there could be a few rare societies where deceiving those who are trying to cooperate with you is the norm; but you see how it would probably require a rare confluence of other factors for a society to function that way.

Similarly, if the entities are developmentally expensive, a resource-constrained society won't function well if they are sacrificed willy-nilly without sufficient cause.  The acquisition of information will presumably also tend to be valued -- both short-term practically applicable information and big-picture understandings that might yield large dividends in the long term.  Benevolence will be valued, too: Reasoners in successful societies will tend to appreciate and reward those who help them and others on whom they depend.  Again, there will be enormous variety in the manifestation of the virtues of preserving others, preserving resources, acquiring knowledge, enacting benevolence, and so on.

Does this mean that if the majority of alien lifeforms breathe methane, it will be morally good to replace Earth's oxygen with methane?  Of course not!  Just as a cross-cultural collaboration of humans can recognize that norms should be differently implemented in different cultures when conditions differ, so also will recognition of local conditions be part of the hypothetical group's informed reflection concerning the norms on Earth.  Our diverse group of intelligent alien reasoners will see the value of contextually relativized norms: On Earth, it's good not to let things get too hot or too cold.  On Earth, it's good for the atmosphere to have more oxygen than methane.  On Earth, given local biology and our cognitive capacities, such-and-such communicative norms seem to work for humans and such-and-such others seem not to.

Maybe some of these alien reasoners would be intractably jingoistic: Antareans are the best and should wipe out all other species!  It's a heinous moral crime to wear blue!  My thought is that in a diverse group of aliens, given plenty of time for reflection and discussion, and the full range of relevant information, such jingoistic ideas will overall tend to fare poorly with a broad audience.

I'm asking you to imagine a wide diversity of successfully cooperative alien (and possibly AI) species -- all of them intelligent, sophisticated, social, and long-lived -- looking at each other and at Earth, entering conversation with us, patiently gathering the information they need, and patiently ironing out their own disagreements in open dialogue.  I think they will tend to condemn the Holocaust and approve of feeding your children.  I think we can surmise this by thinking about what norms would tend to arise in general among developmentally expensive, behaviorally sophisticated social entities, and then considering how intelligent, thoughtful entities would apply those norms to the situation on Earth, given time and favorable conditions to reflect.  I propose that we think of an action as "ethical" or "unethical" to the extent it would tend to garner approval or disapproval under such hypothetical conditions.

It needn't follow that every act is determinately ethically good or bad, or that there's a correct scalar ranking of the ethical goodness or badness of actions.  There might be persistent disagreements even in these hypothesized circumstances.  Maybe there would be no overall tendency toward convergence in puzzle cases, or tragic dilemmas, or when important norms of approximately equal weight come into conflict.  It's actually, I submit, a strength of the alien convergence model that it permits us to make sense of such irresolvability.  (We can even imagine the degree of hypothetical convergence varying independently of goodness and badness.  About Action A, there might be almost perfect convergence on its being a little bit good.  About Action B, in contrast, there might be 80% convergence on its being extremely good.)
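
To make the last point concrete, here is a toy numerical sketch (my own illustration of the distinction, not part of the proposal itself): suppose each hypothetical judge assigns a signed endorsement score to an action, the sign marking approval versus disapproval and the magnitude marking how good or bad the judge takes the action to be.  The degree of convergence (how many judges land on the same side of zero) can then vary independently of the average strength of endorsement.

    # Toy model only: the "judges" and their scores are hypothetical stand-ins
    # for the reflective observers described above.
    def convergence_and_valence(scores):
        """Return (degree of convergence, mean endorsement) for signed scores."""
        if not scores:
            return 0.0, 0.0
        approvals = sum(1 for s in scores if s > 0)
        disapprovals = sum(1 for s in scores if s < 0)
        # Degree of convergence: share of judges on the majority side of zero.
        convergence = max(approvals, disapprovals) / len(scores)
        mean_endorsement = sum(scores) / len(scores)
        return convergence, mean_endorsement

    # Action A: nearly unanimous agreement that it is a little bit good.
    action_a = [0.1] * 98 + [-0.1] * 2
    # Action B: 80% find it extremely good, 20% mildly disapprove.
    action_b = [1.0] * 80 + [-0.5] * 20

    print(convergence_and_valence(action_a))  # very high convergence, small positive mean
    print(convergence_and_valence(action_b))  # 80% convergence, much larger positive mean

On this toy picture, Action A's higher convergence doesn't make it better than Action B: convergence tracks how settled the verdict is, while the scores track how good or bad the action is judged to be.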

Note that, unlike many other naturalistic approaches that ground ethics specifically in human sensibilities, the metaethics of alien convergence is not fundamentally relativistic.  What is morally good depends not on what humans (or aliens) actually judge to be good but rather on what a hypothetical congress of socially sophisticated, developmentally expensive humans, post-humans, aliens, sufficiently advanced AI, and others of the right type would judge to be good.  At the same time, this metaethics avoids committing to the implausible claim that all rational agents (including short-lived, solitary ones) would tend to or rationally need to approve of what is morally good.

Wednesday, July 10, 2024

How the Mimicry Argument Against Robot Consciousness Works

A few months ago on this blog, I presented a "Mimicry Argument" against robot consciousness -- or more precisely, an argument that aims to show why it's reasonable to doubt the consciousness of an AI that is built to mimic superficial features of human behavior.  Since then, my collaborator Jeremy Pober and I have presented this material to philosophy audiences in Sydney, Hamburg, Lisbon, Oxford, Krakow, and New York, and our thinking has advanced.

Our account of mimicry draws on work on mimicry in evolutionary biology.  On our account, a mimic is an entity:

  • with a superficial feature (S2) that is selected or designed to resemble a superficial feature (S1) of some model entity
  • for the sake of deceiving, delighting, or otherwise provoking a particular reaction in some particular audience or "receiver"
  • because the receiver treats S1 in the model entity as an indicator of some underlying feature F.

Viceroy butterflies have wing coloration patterns (S2) that resemble the wing color patterns (S1) of monarch butterflies for the sake of misleading predators who treat S1 as an indicator of toxicity.  Parrots emit songs that resemble the songs or speech of other birds or human caretakers for social advantage.  If the receiver is another parrot, the song in the model (but not necessarily the mimic) indicates group membership.  If the receiver is a human, the speech in the model (but not necessarily the mimic) indicates linguistic understanding.  As the parrot case illustrates, not all mimicry needs to be deceptive, and the mimic might or might not possess the feature the receiver attributes.

Here's the idea in a figure:

[figure illustrating the mimicry relationship among model, mimic, and receiver]
Pober and I define a "consciousness mimic" as an entity whose S2 resembles an S1 that, in the model entity, normally indicates consciousness.  So, for example, a toy that says "hello" when powered on is a consciousness mimic: For the sake of a receiver (a child), it has a superficial feature (S2, the sound "hello" from its speakers) that resembles a superficial feature (S1) of an English-speaking human, a feature that normally indicates consciousness (since humans who say "hello" are normally conscious).

Arguably, Large Language Models like ChatGPT are consciousness mimics in this sense.  They emit strings of text modeled on human-produced text for the sake of users who interpret that text as having semantic content of the same sort that such text normally has when emitted by conscious humans.

Now, if something is a consciousness mimic, we can't straightforwardly infer its consciousness from its possession of S2 in the same way we can normally infer the model's consciousness from its possession of S1.  The "hello" toy isn't conscious.  And if ChatGPT is conscious, that will require substantial argument to establish; it can't be inferred in the same ready way that we infer consciousness in a human from human utterances.

Let me attempt to formalize this a bit:

(1.) A system is a consciousness mimic if:
a. It possesses superficial features (S2) that resemble the superficial features (S1) of a model entity.
b. In the model entity, the possession of S1 normally indicates consciousness.
c. The best explanation of why the mimic possesses S2 is the mimicry relationship described above.

(2.) Robots or AI systems – at least an important class of them – are consciousness mimics in this sense.

(3.) Because of (1c), if a system is a consciousness mimic, inference to the best explanation does not permit inferring consciousness from its possession of S2.

(4.) Some other argument might justify attributing consciousness to the mimic; but if the mimic is a robot or AI system, any such argument, for the foreseeable future, will be highly contentious.

(5.) Therefore, we are not justified in attributing consciousness to the mimic.
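
Purely as an illustrative sketch (the encoding and names here are just for this post, not an official formalism), the structure of premises (1) and (3) can be written out in a few lines of Python:

    from dataclasses import dataclass

    @dataclass
    class Candidate:
        has_s2: bool                      # possesses the superficial feature S2
        s2_resembles_model_s1: bool       # (1a) S2 resembles the model entity's S1
        s1_indicates_consciousness: bool  # (1b) S1 normally indicates consciousness in the model
        mimicry_best_explains_s2: bool    # (1c) the mimicry relationship best explains S2

    def is_consciousness_mimic(c):
        return (c.has_s2 and c.s2_resembles_model_s1
                and c.s1_indicates_consciousness
                and c.mimicry_best_explains_s2)

    def default_inference_to_consciousness_licensed(c):
        # Premise (3): when the mimicry structure is present, possession of S2
        # by itself does not license inference to the best explanation that the
        # system is conscious; otherwise the ordinary default presumption stands.
        return c.has_s2 and not is_consciousness_mimic(c)

    hello_toy = Candidate(True, True, True, True)
    print(is_consciousness_mimic(hello_toy))                       # True
    print(default_inference_to_consciousness_licensed(hello_toy))  # False: further argument needed

The check doesn't show that a mimic lacks consciousness; it only withholds the default inference from S2, which is all that premises (4) and (5) then build on.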

AI systems designed to produce humanlike outputs might understandably tempt users to attribute consciousness on the basis of those superficial features, but we should be cautious about such attributions.  The inner workings of Large Language Models and other AI systems are causally complex and designed to generate outputs that look like the outputs humans produce, for the sake of being interpretable by humans; but not all causal complexity implies consciousness, and superficial resemblance to the behaviorally sophisticated patterns we associate with consciousness can mislead, since such patterns might arise without consciousness.

The main claim is intended to be weak and uncontroversial: When the mimicry structure is present, significant further argument is required before attributing consciousness to an AI system based on superficial features suggestive of consciousness.

Friends of robot or AI consciousness may note two routes by which to escape the Mimicry Argument.  They might argue, contra premise (2), that some important target types of artificial systems are not consciousness mimics.  Or they might present an argument that the target system, despite being a consciousness mimic, is also genuinely conscious – an argument they believe is uncontentious (contra 4) or that justifies attributing consciousness despite being contentious (contra the inference from 4 to 5).

The Mimicry Argument is not meant to apply universally to all robots and AI systems.  Its value, rather, is to clarify the assumptions implicit in arguments against AI consciousness on the grounds that AI systems merely mimic the superficial signs of consciousness.  We can then better see both the merits of that type of argument and means of resisting it.

Wednesday, July 03, 2024

Color the World

My teenage daughter's car earns a lot of attention on the street:

[photos of Kate's hand-painted art car]

People honk and wave, strangers ask to add their own art, five-year-olds drop their toys and gawk.  A few people look annoyed and turn away.  (Kate describes her car as a "personality tester".)

A couple of years ago, I promised Kate my 2009 Honda Accord for when she earned her driver's license.  But knowing that Kate cares about appearances -- stylish clothes and all that -- I promised that I'd have it repainted first, since the paint jobs on these old Hondas age badly in the Southern California sun.  When I saw the cost of a proper paint job, though, I was shocked.  So I suggested that we turn it into an art car, which she and her friends could decorate at will.  (In the 1980s, a few of my friends and I did the same with our old beater cars, generating "Motorized Cathedrals of the Church of the Mystical Anarchist" numbers 2 through 4 -- number 1, of course, being Earth itself.)  She accepted my offer, we bought paints, and voilà, over the months the art has accumulated!

I'm not sure exactly what makes the world intrinsically valuable.  I reject hedonism, on which the only intrinsically valuable thing is pleasure; I'm inclined to think that a diversity of flourishing life is at least as important.  (Consider what one would benevolently hope for on a distant planet.  I'd hope not that it's just a sterile rock, but richly populated with diverse life, including, ideally, rich societies with art, science, philosophy, sports, and varied cultures and ecosystems.)

Multitudinous brown 2009 Honda Accords populate the roads of America.  What a bland, practical car!  The world is richer -- intrinsically better -- for containing Kate's weird variant.  She and her friends have added color to the world.

We might generalize this to a motto: Color the World.

It doesn't have to be a car, of course.  Your creative uniqueness might more naturally manifest in other forms (and it's reasonable to worry about resale value).  It might be tattoos on your body, unusual clothing, the way you decorate your office, house, or yard.  It might be your poetry (even secret poetry, seen by no one else and immediately destroyed, timelessly enriches the fabric of the world), your music, your philosophical prose, your distinctive weirdness on social media.  It might be the unusual way you greet people, your quirky manifestation of the rituals of religion or fandom or parenthood, your taste in metaphor, the way you skip down the street, your puns and dad jokes, your famous barbecue parties.

It would be superhuman to be distinctively interesting in all these domains at once, and probably narcissistic even to try.  Sometimes it's best to be the straight man in boring clothes -- a contrast against which the dazzlingly dressed shine more brightly.  But I think most of us hold back more than we need to, for lack of energy and fear of standing out.  Hoist your freak flag!

I see three dimensions of excellence in coloring the world:

(1.) Your color must be different from the others around you, in a way that stands out, at least to the attentive.  If everyone has a brown Honda, having the only green one already adds diversity, even if green is not intrinsically better than brown.  If baseball hats are abundant, adding another one to the mix doesn't add color; but being the one baseball hat in a sea of fedoras does (and vice versa, of course).

(2.) Your color should ideally express something distinctive about you.  While you might choose a baseball hat to contrast with the fedoras simply because it's different, ideally you choose it because it also discloses an underlying difference between you and the others -- maybe you are the ragingest baseball fan in the group.  Your moon-and-cat tattoo isn't just different but manifests your special affection for cats in moonlight.  Your dad jokes wink with a je ne sais quoi that your friends all instantly recognize.

(3.) Your color should ideally arise from your creative energy.  A baseball cap from the merch store might (in some contexts) be color -- but a cap modified by your own hand is more colorful.  Let it be, if it can, your own artistic endeavor, your paint and brush, your own selection of words, your own decisions about how best to embody the ritual, organize the party, structure the space.  If it's prepackaged, put it together a little differently, or contextualize or use it a little differently.

Can I justify these subsidiary principles by appeal to the ideal of diversity?  Maybe!  Diversity occupies a middle space between bland sameness and chaotic white noise.  By grounding your difference in your distinctive features and your creative inspiration, you ensure the structure, order, and significance that distinguish meaningful diversity from random variation.

I imagine someone objecting to Color the World by counterposing the motto Walk Lightly.  I do feel the pull of Walk Lightly.  Don't make a big fuss.  Let things be.  No need to scratch your name on every tree and upturn all the sand on the beach.  Walk Lightly, perhaps, manifests respect for the color that others bring to the world.  Fair enough.  Make good decisions about when and where to color.

Goodbye for today!  Time to drive my own bland car home.  When I walk in my front door, I'll ritualistically confirm with my wife and daughter that they abided by my usual morning advice to them: (1.) no barfing, and (2.) don't get abducted by aliens.