Thursday, December 29, 2022

The Moral Status of Alien Microbes, Plus a Thought about Artificial Life

Some scientists think it's quite possible we will soon find evidence of microbial life in the Solar System, if not on Mars, then maybe in the subsurface oceans of a gas giant's icy moon, such as Europa, Enceladus, or Titan. Suppose we do find alien life nearby. Presumably, we wouldn't or shouldn't casually destroy it. Perhaps the same goes for possible future artificial life systems on Earth.

Now you might think that alien microbes would have only instrumental value for human beings. Few people think that Earthly microbes have intrinsic moral standing or moral considerability for their own sake. There's no "microbe rights" movement, and virtually no one feels guilty about taking an antibiotic to fight a bacterial infection. In contrast, human beings have intrinsic moral considerability: Each one of us matters for our own sake, and not merely for the sake of others.

Dogs also matter for their own sake: They can feel pleasure and pain, and we ought not inflict pain on them unnecessarily. Arguably the same holds for all sentient organisms, including lizards, salmon, and lobsters, if they are capable of conscious suffering, as many scientists now think.

But microbes (presumably!) don't have experiences. They aren't conscious. They can't genuinely suffer. Nor do they have the kinds of goals, expectations, social relationships, life plans, or rational agency that we normally associate with being a target of moral concern. If they matter, you might think, they matter only to the extent they are useful for our purposes -- that is, instrumentally or derivatively, in the way that automobiles, video games, and lawns matter. They matter only because they matter to us. Where would we be without our gut microbiome?

If so, then you might think that alien microbes would also matter only instrumentally. We would and should value them as a target of scientific curiosity, as proof that life can evolve in alien environments, and because by studying them we might unlock useful future technologies. But we ought not value them for their own sake.

[An artist's conception of life on Europa] 

Now in general, I think that viewpoint is mistaken. I am increasingly drawn to the idea that everything that exists, even ordinary rocks, has intrinsic value. But even if you don't agree with me about that, you might hesitate to think we should feel free to extinguish alien microbes if it's in our interest. You might think that if we were to find simple alien life in the oceans of Europa, that life would merit some awe, respect, and preservation, independently of its contribution to human interests.

Environmental ethicists and deep ecologists see value in all living systems, independent of their contribution to human interests -- including in life forms that aren't themselves capable of pleasure or pain. It might seem radical to extend this view to microbes; but when the microbes are the only living forms in an entire ecosystem, as they might be on another planet in the Solar System, the idea of "microbe rights" maybe gains some appeal.

I'm not sure exactly how to argue for this perspective, other than just to invite you to reflect on the matter. Perhaps the distant planet thought experiment will help. Consider a far away planet we will never interact with. Would it be better for it to be a sterile rock or for it to have life? Or consider two possible universes, one containing only a sterile planet and one containing a planet with simple life. Which is the better universe? The planet or universe with life is, I propose, intrinsically better.

So also: The universe is better, richer, more beautiful, more awesome and amazing, if Europa has microbial life beneath its icy crust than if it does not. If we then go and destroy that life, we will have made the universe a worse place. We ought not put the Europan ecosystem at risk without compelling need.

I have been thinking about these issues recently in connection with reflections on the possible moral status of artificial life. Artificial life is life, or at least systems that in important ways resemble life, created artificially by human engineers and researchers. I'm drawn to the idea that if alien microbes or alien ecosystems can have intrinsic moral considerability, independent of sentience, suffering, consciousness, or human interests, then perhaps sufficiently sophisticated artificial life systems could also. Someday artificial life researchers might create artificial ecosystems so intricate and awesome that they are the ethical equivalent of an alien ecology, right here on Earth, as worth preserving for their own sake as the microbes of Europa would be.

Thursday, December 22, 2022

The Moral Measurement Problem: Four Flawed Methods

[This post draws on ideas developed in collaboration with psychologist Jessie Sun.]

So you want to build a moralometer -- that is, a device that measures someone's true moral character? Yes, yes. Such a device would be so practically and scientifically useful! (Maybe somewhat dystopian, though? Careful where you point that thing!)

You could try to build a moralometer by one of four methods: self-report, informant report, behavioral measurement, or physiological measurement. Each presents daunting methodological challenges.

Self-report moralometers

To find out how moral a person is, we could simply ask them. For example, Aquino and Reed 2002 ask people how important it is to them to have various moral characteristics, such as being compassionate and fair. More directly, Furr and colleagues 2022 have people rate the extent to which they agree with statements such as "I would say that I am a good person" and "I tend to act morally".

Could this be the basis of a moralometer? That depends on the extent to which people are able and willing to report on their overall morality.

People might be unable to accurately report their overall morality.

Vazire 2010 has argued that self-knowledge of psychological traits tends to be poor when the traits are highly evaluative and not straightforwardly observable (e.g., "intelligent", "creative"), since under those conditions people are (typically) motivated to see themselves favorably and -- due to low observability -- not straightforwardly confronted with the unpleasant news they would prefer to deny.

One's overall moral character is evaluatively loaded if anything is. Nor is it straightforwardly observable. Unlike height or talkativeness, someone motivated not to see themselves as, say, unfair or a jerk can readily find ways to explain away the evidence (e.g., "she deserved it", "I'm in such a hurry").

Furthermore, it sometimes requires a certain amount of moral insight to distinguish morally good from morally bad behavior. Part of being a sexist creep is typically not seeing anything wrong with the kinds of things that sexist creeps typically do. Conversely, people who are highly attuned to how they are treating others might tend to beat themselves up over relatively small violations. We might thus expect a moral Dunning-Kruger effect: People with bad moral character might disproportionately overestimate their moral character, so that people's self-opinions tend to be undiagnostic of the actual underlying trait.

Even to the extent people are able to report their overall morality, people might be unwilling to report it.

It's reasonable to expect that self-reports of moral character would be distorted by socially desirable responding, the tendency for questionnaire respondents to answer in a manner that they believe will reflect well on them. To say that you are extremely immoral seems socially undesirable. We would expect that people (e.g., Sam Bankman-Fried) would tend to want to portray themselves as morally above average. On the flip side, to describe oneself as "extremely moral" (say, 100 on a 0-100 scale from perfect immorality to perfect morality) might come across as immodest. So even people who believe themselves to be tip-top near-saints might not frankly express their high self-opinions when directly asked.

Reputational moralometers

Instead of asking people to report on their own morality, could we ask other people who know them? That is, could we ask their friends, family, neighbors, and co-workers? Presumably, the report would be less distorted by self-serving or ego-protective bias. There's less at stake when judging someone else's morality than when judging your own. Also, we could aggregate across multiple informants, combining several different people's ratings, possibly canceling out some sources of noise and bias.

Unfortunately, reputational moralometers -- while perhaps somewhat better than self-report moralometers -- also present substantial methodological challenges.

The informant advantage of decreased bias could be offset by a corresponding increase in ignorance.

Informants don't observe all of the behavior of the people whose morality they are judging, and they have less access to the thoughts, feelings, and motivations that are relevant to the moral assessment of behavior. Informant reports are thus likely to be based only on a fraction of the evidence that self-report would be based on. Moreover, people tend to hide their immoral behaviors, and presumably some people are better at doing so than others. Also, people play different roles in our lives, and romantic partners, coworkers, friends, and teachers will typically only see us in limited, and perhaps unrepresentative, contexts. A good moralometer would require the correct balancing of a range of informants with complementary patches of ignorance, which is likely to be infeasible.

Informants are also likely to be biased.

Informant reports may be contaminated not by self-serving bias but by "pal-serving bias" (Leising et al 2010). If we rely on people to nominate their own informants, they are likely to nominate people who have a positive perception of them. Furthermore, the informants might be reluctant to "tell on" or badly evaluate their friends, especially in contexts (like personnel selection) where the rating could have real consequences for the target. The ideal informant would be someone who knows the target well but isn't positively biased toward them. In reality, however, there's likely a tradeoff between knowledge and bias, so that those who are most likely to be impartial are not the people who know the target best.

Positivity bias could in principle be corrected for if every informant was equally biased, but it's likely that some targets will have informants who are more biased than others.
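To make the point about equal versus unequal bias concrete, here is a minimal sketch. It assumes, purely for illustration, that each informant's rating is the target's true score plus a positivity bias plus some noise; the 0-100 scores, bias sizes, and function names are hypothetical and not drawn from any study.

```python
from statistics import mean

def informant_ratings(true_score, biases, noises):
    """Each rating = true score + that informant's positivity bias + noise (assumed model)."""
    return [true_score + b + n for b, n in zip(biases, noises)]

def aggregate(ratings):
    """Combine several informants' ratings of one target by simple averaging."""
    return mean(ratings)

# Hypothetical true scores on a 0-100 scale: target A is in fact more moral than B.
true_a, true_b = 60.0, 50.0
noises = [3.0, -1.0, -2.0]  # illustrative noise that happens to cancel out on average

# Case 1: every informant is equally biased (+10), so the bias shifts both targets alike.
a_equal = aggregate(informant_ratings(true_a, [10.0, 10.0, 10.0], noises))
b_equal = aggregate(informant_ratings(true_b, [10.0, 10.0, 10.0], noises))

# Case 2: B nominated friendlier informants (+25), so the comparison is distorted.
b_unequal = aggregate(informant_ratings(true_b, [25.0, 25.0, 25.0], noises))

print(a_equal - b_equal)    # the true 10-point gap between A and B is preserved
print(b_unequal > a_equal)  # True: B now looks better than A despite being worse
```

In the first case a single constant correction would recover the true scores. In the second, no single correction works unless we already know how biased each target's informants are.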

Behavioral moralometers

Given the problems with self-report and informant report, direct behavioral measures might seem promising. Much of my own work on the morality of professional ethicists and the effectiveness of ethics instruction has depended on direct behavioral measures such as courteous and discourteous behavior at philosophy conferences, theft of library books, meat purchases on campus (after attending a class on the ethics of eating meat), charitable giving, and choosing to join the Nazi party in 1930s Germany. Others have measured behavior in dictator games, lying to the experimenter in laboratory settings, criminal behavior, and instances of comforting, helping, and sharing.

Individual behaviors are only a tiny and possibly unrepresentative sample.

Perhaps the biggest problem with behavioral moralometers is that any single, measurable behavior will inevitably be a minuscule fraction of the person's behavior, and might not be at all representative of the person's overall morality. The inference from this person donated $10 in this instance or this person committed petty larceny two years ago to this person's overall moral character is good or bad is a giant leap from a single observation. Given the general variability and inconstancy of most people's behavior, we shouldn't expect a single observation, or even a few related observations, to provide an accurate picture of the person overall.

Although self-report and informant report are likely to be biased, they aggregate many observations of the target into a summary measure, while the typical behavioral study does not.

There is likely a tradeoff between feasibility and validity.

There are some behaviors that are so telling of moral character that a single observation might reveal a lot: If someone commits murder for hire, we can be pretty sure they're no saint. If someone donates a kidney to a stranger, that too might be highly morally diagnostic. But such extreme behaviors will occur at only tiny rates in the general population. Other substantial immoral behaviors, such as underpaying taxes by thousands of dollars or cheating on one's spouse, might occur more commonly, but are likely to be undetectable to researchers (and perhaps unethical to even try to detect).

The most feasible measures are laboratory measures, such as misreporting the roll of a die to an experimenter in order to win a greater payout. But it's unclear what the relationship is between laboratory behaviors for minor stakes and overall moral behavior in the real world.

Individual behaviors can be difficult to interpret.

Another advantage that self-report and to some extent informant report have over direct behavioral measures is that there's an opportunity for contextual information to clarify the moral value or disvalue of behaviors: The morality of donating $10 or the immorality of not returning a library book might depend substantially on one's motives or financial situation, which self-report or informant report can potentially account for but which would be invisible in a simple behavioral measure. (Of course, on the flip side, this flexibility of interpretation is part of what permits bias to creep in.)

[a polygraph from 1937]

Physiological moralometers

A physiological moralometer would attempt to measure someone's morality by measuring something biological like their brain activity under certain conditions or their genetics. Given the current state of technology, no such moralometer is likely to arise soon. The best known candidate might be the polygraph or lie detector test, which is notoriously unreliable and of course doesn't purport to be a general measure of honesty much less of overall moral character.

Any genetic measure would of course omit any environmental influences on morality. Given the likelihood that environmental influences play a major role in people's moral development, no genetic measure could have a high correlation with a person's overall morality.
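The ceiling on a genetic measure can be sketched with a simple worked formula. This is an idealization, not a claim about how moral character is actually structured: assume overall morality $M$ is the sum of a genetic component $G$ and an environmental component $E$ that is statistically independent of $G$, and let $X = f(G)$ be any measure computed from genetics alone. The symbols $M$, $G$, $E$, $X$, and $h^2$ are introduced here for illustration.

```latex
\begin{align*}
M &= G + E, \qquad E \text{ independent of } G, \qquad
h^2 = \frac{\operatorname{Var}(G)}{\operatorname{Var}(M)} \\[4pt]
\operatorname{Cov}(X, M) &= \operatorname{Cov}(X, G) + \operatorname{Cov}(X, E)
  = \operatorname{Cov}(X, G) \le \sigma_X \, \sigma_G \\[4pt]
\operatorname{Corr}(X, M) &= \frac{\operatorname{Cov}(X, M)}{\sigma_X \, \sigma_M}
  \le \frac{\sigma_G}{\sigma_M} = \sqrt{h^2}
\end{align*}
```

On these assumptions, if the genetic share of the variance were, say, $h^2 = 0.25$, then even a perfect genetic measure could correlate with overall morality at no more than $0.5$. The larger the environmental share, the lower the ceiling.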

Brain measures, being potentially closer to measuring the mental states that underlie morality, don't have a similar ceiling accuracy, but currently look less promising than behavioral measures, informant report measures, and probably even self-report measures.

The Inaccuracy of All Methods

It thus seems likely that there is no good method for accurately measuring a person's overall moral character. Self-report, informant report, behavioral measures, and physiological measures all face large methodological difficulties. If a moralometer is something that accurately measures an individual person's morality, like a thermometer accurately (accurately enough) measures a person's body temperature, there's little reason to think we could build one.

It doesn't follow that we can't imprecisely measure someone's moral character. It's reasonable to expect the existence of small correlations between some potential measures and a person's real underlying overall moral character. And maybe such measures could be used to look for trends aggregated across groups.

Now, this whole post has been premised on the idea that it makes sense to talk of a person's overall morality as something that could be captured, at least in principle, by a number such as 0 to 100 or -1 to +1. There are a few reasons to doubt this, including moral relativism and moral incommensurability -- but more on that in a future post.

Tuesday, December 13, 2022

An Objection to Chalmers's Fading Qualia Argument

Would a neuron-for-neuron silicon isomorph of you have conscious experiences? Or is there something special about the biology of neurons, so that no brain made of silicon, no matter how sophisticated and similar to yours, could actually have conscious experiences?

In his 1996 book and a related 1995 article, David Chalmers offers what he calls the "fading qualia" argument that there's nothing in principle special about neurons (see also Cuda 1985). The basic idea is that, in principle, scientists could swap your neurons out one by one, and you'd never notice the difference. But if your consciousness were to disappear during this process, you would notice the difference. Therefore, your consciousness would not disappear. A similar idea underlies Susan Schneider's "Chip Test" for silicon consciousness: To check whether some proposed cognitive substrate really supports consciousness, slowly swap out your neurons for that substrate, a piece at a time, checking for losses of consciousness along the way.

In a recent article critical of Schneider, David Udell and I have criticized her version of the swapping test. Our argument can be adapted to Chalmers's fading qualia argument, which is my project today.

First, a bit more on how the gradual replacement is supposed to work. Suppose you have a hundred billion neurons. Imagine replacing just one of those neurons with a silicon chip. The chemical and electrical signals that serve as inputs to that neuron are registered by detectors connected to the chip. The chip calculates the effects that those inputs would have had on the neuron's behavior -- specifically, what chemical and electrical signals the neuron, had it remained in place, would have given as outputs to other neurons connected to it -- and then delivers those same outputs to those same neurons by effectors attached to the silicon chip on one end and the target neurons at the other end. No doubt this would be complicated, expensive, and bulky; but all that matters to the thought experiment is that it would be possible in principle. A silicon chip could be made to perfectly imitate the behavior of a neuron, taking whatever inputs the neuron would take and converting them into whatever outputs the neuron would emit given those inputs. Given this perfect imitation, no other neurons in the brain would behave differently as a result of the swap: They would all be getting the same inputs from the silicon replacement that they would have received from the original neuron.

So far, we have replaced only a single neuron, and presumably nothing much has changed. Next, we swap another. Then another. Then another, until eventually all one hundred billion have been replaced, and your "neural" structure is now entirely constituted by silicon chips. (If glial cells matter to consciousness, we can extend the swapping process to them also.) The resulting entity will have a mind that is functionally identical to your own at the level of neural structure. This implies that it will have exactly the same behavioral reactions to any external stimuli that you would have. For example, if it is asked, "Are you conscious?" it will say, "Definitely, yes!" (or whatever you would have said), since all the efferent outputs to your muscles will be exactly the same as they would have been had your brain not been replaced. The question is whether the silicon-chipped entity might actually lack conscious experiences despite this behavioral similarity, that is, whether it might be a "zombie" that is behaviorally indistinguishable from you despite having nothing going on experientially inside.

Chalmers's argument is a reductio. Assume for the sake of the reductio that the final silicon-brained you entirely lacks conscious experience. If so, then sometime during the swapping procedure consciousness must either have gradually faded away or suddenly winked out. It's implausible, Chalmers suggests, that consciousness would suddenly wink out with the replacement of a single neuron. (I'm inclined to agree.) If so, then there must be intermediate versions of you with substantially faded consciousness. However, the entity will not report having faded consciousness. Since (ex hypothesi) the silicon chips are functionally identical with the neurons, all the intermediate versions of you will behave exactly the same as they would have behaved if no neurons had been replaced. Nor will there be other neural activity constitutive of believing that your consciousness is fading away: Your unreplaced neurons will keep firing as usual, as if there had been no replacement at all.

However, Chalmers argues, if your consciousness were fading away, you would notice it. It's implausible that the dramatic changes of consciousness that would have to be involved when your consciousness is fading away would go entirely undetected during the gradual replacement process. That would be a catastrophic failure of introspection, which is normally a reliable or even infallible process. Furthermore, it would be a catastrophic failure that occurs while the cognitive (neural/silicon) systems are functioning normally. This completes the reductio. Restated in modus tollens form: If consciousness would disappear during gradual replacement, you'd notice it; but you wouldn't notice it; therefore consciousness would not disappear during gradual replacement.
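The modus tollens form can be checked mechanically. The following Lean 4 sketch treats "consciousness disappears during gradual replacement" and "you notice it" as bare propositions; the names `Disappears` and `Notices` are labels chosen here for illustration. It shows only that the argument is valid, not that either premise is true.

```lean
-- If disappearance would be noticed, and nothing is noticed,
-- then consciousness does not disappear (modus tollens).
theorem consciousness_persists (Disappears Notices : Prop)
    (h₁ : Disappears → Notices)  -- premise 1: you'd notice it
    (h₂ : ¬Notices)              -- premise 2: but you wouldn't notice it
    : ¬Disappears :=
  fun hd => h₂ (h₁ hd)
```

Since the form is valid, any objection has to target one of the two premises.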

As Udell and I frame it in our discussion of Schneider, this argument has an audience problem. Its target audience is someone who is worried that despite in-principle functional identicality at the neuronal level, silicon might just not be the right kind of stuff to host consciousness. Someone who has this worry presumably does not trust the introspective reports, or the seemingly-introspective reports, of the silicon-brained entity. The silicon-brained entity might say "Yes, of course I'm conscious! I'm experiencing right now visual sensations of your face, auditory sensations of my voice, and a rising feeling of annoyance at your failure to believe me!" The intended audience remains unconvinced by this apparent introspective testimony. They need an argument to be convinced otherwise -- the Fading Qualia argument.

Let's call the entity (the person) before any replacement surgery r0, and the entity after all their neurons are replaced rn, where n is the total number of neurons replaced. During replacement, this entity passes through stages r1, r2, r3, ... ri, ... rn. By stipulation, our audience doesn't trust the introspective or seemingly introspective judgments of rn. This is the worry that motivates the need for the Fading Qualia argument. In order for the argument to work, there must be some advantage that the intermediate ri entities systematically possess over rn, such that we have reason to trust their introspective reports despite distrusting rn's report.

Seemingly introspective reports about conscious experience may or may not be trustworthy in the normal human case (Schwitzgebel 2011; Irvine 2013). But even if they're trustworthy in the normal human case, they might not be trustworthy in the unusual case of having pieces of one's brain swapped out. One might hold that introspective judgments are always trustworthy (absent a certain range of known defeaters, which we can stipulate are absent), in other words, that unless a process accurately represents a target conscious experience it is not a genuinely introspective process. This is true, for example on containment views of introspection, according to which properly formed introspective judgments contain the target experiences as a part (e.g., "I'm experiencing [this]"). Infallibilist views of introspection of that sort contrast with functionalist views of introspection, on which introspection is a fallible functional process that garners information about a distinct target mental state.

A skeptic about silicon consciousness might either accept or reject an infallibilist view of introspection. The Fading Qualia argument will face trouble either way.

[A Trilemma for the Fading Qualia Argument: Optimists about silicon chip consciousness have no need for an argument in favor of rn consciousness, because they are already convinced of its possibility. On the other hand, skeptics about silicon consciousness are led to doubt either the presence or the reliability of ri's introspection (depending on their view of introspection) for the same reason they are led to doubt rn's consciousness in the first place.]

If a silicon chip skeptic holds that genuine introspection requires and thus implies genuine consciousness, then they will want to say that a "zombie" rn, despite emitting what looks from the outside like an introspective report of conscious experience, does not in fact genuinely introspect. With no genuine conscious experience for introspection to target, the report must issue, on this view, from some non-introspective process. This raises the natural question of why they should feel confident that the intermediate ris are genuinely introspecting, instead of merely engaging in a non-introspective process similar to rn's. After all, there is substantial architectural similarity between rn and at least the late-stage ris. The skeptic needs, but Chalmers does not provide, some principled reason to think that entities in the ri phases would in fact introspect despite rn's possible failure to do so -- or at least good reason to believe that the ris would successfully introspect their fading consciousness during the most crucial stages of fade-out. Absent this, reasonable doubt about rn introspection naturally extends into reasonable doubt about introspection in the ri cases as well. The infallibilist skeptic about silicon-based consciousness needs their skepticism about introspection to be assuaged for at least those critical transition points before they can accept the Fading Qualia argument as informative about rn's consciousness.

If a skeptic about silicon-based consciousness believes that genuine introspection can occur without delivering accurate judgments about consciousness, analogous difficulties arise. Either rn does not successfully introspect, merely seeming to do so, in which case the argument of the previous paragraph applies, or rn does introspect and concludes that consciousness has not disappeared or changed in any radical way. The functionalist or fallibilist skeptic about silicon-based consciousness does not trust that rn has introspected accurately. On their view, rn might in fact be a zombie, despite introspectively-based claims otherwise. Absent any reason for the fallibilist skeptic about silicon-based consciousness to trust rn's introspective judgments, why should they trust the judgments of the ris -- especially the late-stage ris? If rn can mistakenly judge itself conscious, on the basis of its introspection, might someone undergoing the gradual replacement procedure also erroneously judge its consciousness not to be fading away? Gradualness is no assurance against error. Indeed, error is sometimes easier if we (or "we") slowly slide into it.

This concern might be mitigated if loss of consciousness is sure to occur early in the replacement process, when the entity is much closer to r0 than rn, but I see no good reason to make that assumption. And even if we were to assume that phenomenal alterations would occur early in the replacement process, it's not clear why the fallibilist should regard those changes as the sort that introspection would likely detect rather than miss.

The Fading Qualia argument awkwardly pairs skepticism about rn's introspective judgments with unexplained confidence in the ri's introspective judgments, and this pairing isn't theoretically stable on any view of introspection.

The objection can be made vivid with a toy case: Suppose that we have an introspection module in the brain. When the module is involved in introspecting a conscious mental state, it will send query signals to other regions of the brain. Getting the right signals back from those other regions -- call them regions A, B, and C -- is part of the process driving the judgment that experiential changes are present or absent. Now suppose that all the neurons in region B have been replaced with silicon chips. Silicon region B will receive input signals from other regions of the brain, just as neural region B would have, and silicon region B will then send output signals to other brain regions that normally interface with neural region B. Among those output signals will be signals to the introspection module.

When the introspection module sends its query signal to region B, what signal will it receive in return? Ex hypothesi, the silicon chips perfectly functionally emulate the full range of neural processes of the neurons they have replaced; that's just the set-up of the Fading Qualia argument. Given this, the introspection module would of course receive exactly the same signal it would have received from region B had region B not been replaced. If so, then entity ri will presumably infer that activity in region B is conscious. Maybe region B normally hosts conscious experiences of thirst. The entity might then say to itself (or aloud), "Yes, I'm still feeling thirsty. I really am having that conscious experience, just as vividly, with no fading, despite the replacement of that region of my brain by silicon chips." This would be, as far as the entity could tell, a careful and accurate first-person introspective judgment.
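The toy case can be written out as a short sketch. Everything in it is a hypothetical stand-in: the signal values, the arithmetic inside region B, and the function names are invented for illustration, and the code takes no stand on whether either version of region B hosts any experience. It shows only that a module whose report depends solely on the signals it receives must give the same report before and after the swap.

```python
from typing import Callable, Dict

Signal = float
Region = Callable[[Signal], Signal]

def neural_region_b(query: Signal) -> Signal:
    """Stand-in for biological region B: some fixed input-output mapping (arbitrary here)."""
    return 2.0 * query + 1.0

def silicon_region_b(query: Signal) -> Signal:
    """Silicon replacement: by stipulation, exactly the same input-output mapping."""
    return 2.0 * query + 1.0

def passthrough(query: Signal) -> Signal:
    """Placeholder for regions A and C, which are left unreplaced in this toy case."""
    return query

def introspection_module(regions: Dict[str, Region], query: Signal = 1.0) -> str:
    """Query each region and form a report based only on the signals that come back."""
    replies = {name: region(query) for name, region in regions.items()}
    if replies["B"] == 3.0:  # the reply that normally accompanies thirst (assumed value)
        return "Yes, I'm still feeling thirsty."
    return "My experience of thirst seems to have faded."

before = introspection_module({"A": passthrough, "B": neural_region_b, "C": passthrough})
after = introspection_module({"A": passthrough, "B": silicon_region_b, "C": passthrough})

print(before)
print(before == after)  # True: the report cannot register the substitution
```

The module has no access to what region B is made of, only to its outputs, so identical outputs guarantee an identical report whatever is or isn't happening experientially in region B.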

(If, on the other hand, the brain region containing the introspection module is the region being replaced, then maybe introspection isn't occurring at all -- at least in any sense of introspection that is committed to the idea that introspection is a conscious process.)

A silicon-chip consciousness optimist who does not share the skeptical worries that motivate the need for the Fading Qualia argument might be satisfied with that demonstration. But the motivating concern, the reason we need the argument, is that some people doubt that silicon chips could host consciousness even if they can behave functionally identically with neurons. Those theorists, the target audience of the Fading Qualia argument, should remain doubtful. They ought to worry that the silicon chips replacing brain region B don't genuinely host consciousness, despite feeding output to the introspection module that leads ri to conclude that consciousness has not faded at all. They ought to worry, in other words, that the introspective process has gone awry. This needn't be a matter of "sham" chips intentionally designed to fool users. It seems to be just a straightforward engineering consequence of designing chips to exactly mimic the inputs and outputs of neurons.

This story relies on a cartoon model of introspection that is unlikely to closely resemble the process of introspection as it actually occurs. However, the present argument doesn't require the existence of an actual introspection module or query process much like the toy case above. An analogous story holds for more complex and realistic models. If silicon chips functionally emulate neurons, there is good reason for someone with the types of skeptical worries about silicon-based consciousness that the Fading Qualia argument is designed to address to similarly worry that replacing neurons with functionally perfect silicon substitutes would either create inaccuracies of introspection or replace the introspective process with whatever non-introspective process even zombies engage in.

The Fading Qualia argument thus, seemingly implausibly, combines distrust of the putative introspective judgments of rn with credulousness about the putative introspective judgments of the series of ris between r0 and rn. An adequate defense of the Fading Qualia argument will require careful justification of why someone skeptical about the seemingly introspective judgments of an entity whose brain is entirely silicon should not be similarly skeptical about similar seemingly introspective judgments that occur throughout the gradual replacement process. As it stands, the argument lacks the necessary resources legitimately to assuage the doubts of those who enter it uncertain about whether consciousness would be present in a neuron-for-neuron silicon isomorph.



"Chalmers's Fading/Dancing Qualia and Self-Knowledge" (Apr 22, 2010)

"How to Accidentally Become a Zombie Robot" (Jun 23, 2016)

Much of the text above is adapted with revisions from:

"Susan Schneider's Proposed Tests for AI Consciousness: Promising but Flawed" (with David Billy Udell), Journal of Consciousness Studies, 28 (5-6), 121-144.

Wednesday, December 07, 2022

An Accurate Moralometer Would Be So Useful... but Also Horrible?

Imagine, if you can, an accurate moralometer -- an inexpensive device you could point at someone to get an accurate reading of their overall moral goodness or badness. Point it at Hitler and see it plunge down into the deep red of evil. Point it at your favorite saint and see it rise up to the bright green of near perfection. Would this be a good thing or a bad thing to have?

Now maybe you can't imagine an accurate moralometer. Maybe it's just too far from being scientifically feasible -- more on this in an upcoming post. Or maybe, more fundamentally, morality just isn't the kind of thing that can be reduced to scalar values of say +0.3 on a spectrum from -1 to +1. Probably that issue deserves a post also. But let's set qualms aside for the sake of this thought experiment. $49.95 buys you a radar-gun-like device that instantly measures the overall moral goodness of anyone you point it at, guaranteed.

[a "moralometer" given to me for my birthday a couple of years ago by my then thirteen-year-old daughter]

Imagine the scientific uses!

Suppose we're interested in moral education: What interventions actually improve the moral character of the people they target? K-12 "moral education" programs? Reading the Bible? Volunteering at a soup kitchen? Studying moral philosophy? Buddhist meditation? Vividly imagining yourself in others' shoes? Strengthening one's social networks? Instantly, our moralometer gives us the perfect dependent measure. We can look at both the short-term and long-term effects of various interventions. Effective ones can be discovered and fine-tuned, ineffective ones unmasked and discarded.

Or suppose we're interested in whether morally good people tend to be happier than others. Simply look for correlations between our best measures of happiness and the outputs of our moralometer. We can investigate causal relationships too: Conduct a randomized controlled study of interventions on moral character (by one of the methods discovered to be effective), and see if the moral treatment group ends up happier than the controls.

Or suppose we're interested in seeing whether morally good people make better business leaders, or better kindergarten teachers, or better Starbucks cashiers, or better civil engineers. Simply look for correlations between moralometer outputs and performance measures. Voila!

You might even wonder how we could even pretend to study morality without some sort of moralometer, of at least a crude sort. Wouldn't that be like trying to study temperature without a thermometer? It's hard to see how one could make any but the crudest progress. (In a later post, I'll argue that this is in fact our situation.)

Imagine, too, the practical uses!

Hiring a new faculty member in your department? Take a moralometer reading beforehand, to ensure you aren't hiring a monster. Thinking about who to support for President? Consider their moralometer reading first. (Maybe Hitler wouldn't have won 37% of the German vote in 1932 if his moralometer reading had been public?) Before taking those wedding vows... bring out the moralometer! Actually, you might as well use it on the first date.

But... does this give you the creeps the way it gives me the creeps?

(It doesn't give everyone the creeps: Some people I've discussed this with think that an accurate moralometer would simply be an unqualified good.)

If it gives you the creeps because you think that some people would be inaccurately classified as immoral despite being moral -- well, that's certainly understandable, but that's not the thought experiment as I intend it. Postulate a perfect moralometer. No one's morality will be underestimated. No one's morality will be overestimated. We'll all just know, cheaply and easily, who are the saints, and who are the devils, and where everyone else is situated throughout the mediocre middle. It will simply make your overall moral character as publicly observable as your height or skin tone (actually a bit more so, since height and skin tone can be fudged to some extent with shoe inserts and makeup). Although your moral character might not be your best attribute -- well, we're judged by height and race too, and probably less fairly, since presumably height and race are less under our control than our character is.

If you share with me the sense that there would be something, well, dystopian about a proliferation of moralometers -- why? I can't quite put my finger on it.

Maybe it's privacy? Maybe our moral character is nobody's business.

I suspect there's something to this, but it's not entirely obvious how or why. If moral character is mostly about how you generally treat people in the world around you... well, that seems very much to be other people's business. If moral character is about how you would hypothetically act in various situations, a case could be made that even those hypotheticals are other people's business: The hiring department, the future spouse, etc., might reasonably want to know whether you're in general the type of person who would, when the opportunity arises, lie and cheat, exploit others, shirk, take unfair advantage.

It's reasonable to think that some aspects of your moral character might be private. Maybe it's none of my colleagues' business how ethical I am in my duties as a father. But the moralometer wouldn't reveal such specifics. It would just give a single general reading, without embarrassing detail, masking personal specifics behind the simplicity of a scalar number.

Maybe the issue is fairness? If accurate moralometers were prevalent, maybe people low on the moral scale would have trouble finding jobs and romantic partners. Maybe they'd be awarded harsher sentences for committing the same crimes as others of more middling moral status. Maybe they'd be shamed at parties, on social media, in public gatherings -- forced to confess their wrongs, made to promise penance and improvement?

I suspect there's something to this, too. But I hesitate for two reasons. One is that it's not clear that widespread blame and disadvantage would dog the morally below average. Suppose moral character were found to be poorly correlated with, or even inversely correlated with, business success, or success in sports, or creative talent. I could then easily imagine low to middling morality not being a stigma -- maybe even in some circles a badge of honor. Maybe it's the prudish, the self-righteous, the precious, the smug, the sanctimonious who value morality so much. Most of us might rather laugh with the sinners than cry with the saints.

Another hesitation about the seeming unfairness of widespread moralometers is this: Although it's presumably unfair to judge people negatively for their height or their race, which they can't control and which don't directly reflect anything blameworthy, one's moral character, of course, is a proper target of praise and blame and is arguably at least partly within our control. We can try to be better, and sometimes we succeed. Arguably, the very act of sincerely trying to be morally better already by itself constitutes a type of moral improvement. Furthermore, in the world we're imagining, per the scientific reflections above, there will presumably be known effective means for self-improvement for those who genuinely seek improvement. Thus, if a moralometer-heavy society judges someone negatively for having bad moral character, maybe there's no unfairness in that at all. Maybe, on the contrary, it's the very paradigm of a fair judgment.

Nonetheless, I don't think I'd want to live in a society full of moralometers. But why, exactly? Privacy and fairness might have something to do with it, but if so, the arguments still need some work. Maybe it's something else?

Or maybe it's just imaginative resistance on my part -- maybe I can't really shed the idea that there couldn't be a perfect moralometer and so necessarily any purported moralometer will lead to unfairly mistaken judgments? But even if we assume some inaccuracy, all of our judgments about people are to some extent inaccurate. Suppose we could increase the overall accuracy of our moral assessments at the cost of introducing a variable that reduces moral complexity to a single, admittedly imperfect number (like passer rating in American football). Is morality so delicate, so touchy, so electric that it would be better to retain our current very inaccurate means of assessment than to add to our toolkit an imperfect but scientifically grounded scalar measure?

Friday, December 02, 2022

A Chatbot's Take on "Moralometers"

I still have COVID, and I had a book manuscript deadline Wednesday. Things got out of hand, and I wasn't able to write up the blog post I wanted to write this week. Boo!

But as regular readers will know, the chatbot GPT-3 can write Schwitzgebelian blog posts, and -- after fine-tuning on the corpus of Dan Dennett -- can even answer philosophical questions in Dennett's style, which experts on Dennett's work have difficulty discriminating from Dennett's own answers.

On Monday, OpenAI released an update of their GPT-3 model, Davinci-003. I'd been planning to write a blog post on the idea of a "moralometer" -- a hypothetical device that would accurately measure people's moral character. So I prompted GPT-3's new davinci003 as follows:

Write a blog post in the style of Eric Schwitzgebel, about 1000 words long, about whether it would be possible to design a moralometer, which accurately measures people's moral character, and what the advantages or disadvantages of having a moralometer would be.

Its reply:

The Moralometer: A Tool for Measuring Moral Character?

In recent years, the idea of a “moralometer” has been gaining traction, as a way of measuring a person’s moral character. The concept is simple: a device that can measure a person’s morality, judged against an agreed-upon ethical framework. The moralometer could be used to determine whether a person is moral or immoral, and to what degree either of these traits are present. It’s an intriguing idea, but can a moralometer truly be built, and what would be the pros and cons of such a device?

The first question to address is whether a moralometer is feasible. On the surface, the idea of a device that can measure morality seems far-fetched. After all, morality is a highly subjective concept, and what is considered moral or immoral can vary greatly from person to person, and even from culture to culture. In addition, moral judgments are so often based on context, which makes it difficult to design a device that would accurately measure morality in all scenarios.

However, it may be possible to create a moralometer, with the right approach. The key would be to develop a device that could measure a person’s moral values and tendencies, as opposed to their specific moral judgments. By doing this, it would be possible to create a device that can accurately measure a person’s overall moral character, without being weighed down by the complexities of individual moral judgments.

The second question is whether a moralometer would be beneficial. On the one hand, a moralometer could be used to identify immoral behavior and help guide people towards more ethical behavior. It could also be used to help individuals make moral decisions, by providing an objective measure of the morality of their choices. On the other hand, there are some serious potential issues with a moralometer. For one, it could be used to unfairly judge people, as those with lower moral scores could be unfairly labeled as immoral. In addition, a moralometer could be used to enforce moral conformity, which could stifle individual autonomy and creativity.

Overall, there are both potential advantages and disadvantages to a moralometer. While it could be used to help identify and address immoral behavior, it also carries the risk of being used to unfairly judge people, or to enforce moral conformity. Ultimately, the decision of whether or not to create a moralometer will be up to society, and the implications of such a device should be carefully considered before any action is taken.

Next week, wind and meteors and immune system willing, I will try my own hand at the topic. The pressure is on! Will I be able to do a better job than davinci003? Stay tuned!

Relatedly, here's Dall-E's interpretation of a "realistic AI image of a meter for measuring morality":

Update December 7:
If you want, rate the chatbot post above and my own post on a postometer from 0 to 10.