Friday, February 24, 2023

Moral Mediocrity, Apologizing for Vegetarianism, and Do-Gooder Derogation

Though I'm not a vegetarian, one of my research interests is the moral psychology of vegetarianism. Last weekend, when I was in Princeton giving a talk on robot rights, a vegetarian apologized to me for being vegetarian.

As a meat-eater, I find it's not unusual for vegetarians to apologize to me. Maybe this wouldn't be so notable if their vegetarianism inconvenienced me in any way, but often it does not. In Princeton, we were both in line for a catering spread that had both meat and vegetarian options. I was in no obvious way wronged, harmed, or inconvenienced. So what is going on?

Here's my theory.

Generally speaking, I believe that people aim to be morally mediocre. That is, rather than aiming to be morally good (or not morally bad) by absolute standards, most people aim to be about as morally good as their peers -- not especially better, not especially worse. People might not conceptualize themselves as aiming for mediocrity. Often, they concoct post-hoc rationalizations to justify their choices. But their choices implicitly reveal their moral target. Systematically, people avoid being among the worst of their peers while refusing to pay the costs of being among the best. For example, they don't want to be the one jerk who messes up a clean environment; but they also don't want to be the one sucker who puts in the effort to keep things clean if others aren't also doing so. (See my notes on the game of jerk and sucker.)

Now if people do in fact aim to be about as morally good as their peers, we can expect that under certain conditions they don't want their peers to improve their moral behavior. Under what conditions? Under the conditions that your peers' self-improvement benefits you less than the raising of the moral bar costs you.

Let's say that your friends all become nicer to each other. This isn't so bad. You benefit from being in a circle of nice people. Needing to become a bit nicer yourself might be a reasonable cost to pay for that benefit. 

But if your friends start becoming vegetarians, you accrue the moral costs without the benefits. The moral bar is raised for you, implicitly, at least a little bit; but the benefits go to non-human animals, if they go anywhere. You now either have to think a bit worse of yourself relative to your peers or you have to start changing your behavior. How annoying! No wonder vegetarians are moved to apologize. (To be clear, I'm not saying we should be annoyed by this, just that my theory predicts that we will be annoyed.)

Note that this explanation works especially well for those of us who think it is morally better to avoid eating meat than for those of us who see no moral difference between eating meat and eating vegetarian. If you really see no moral difference (deep down, and not just because of superficial, post-hoc rationalization), then you'll see the morally motivated vegetarian as simply morally confused. If they apologize, it would be like someone apologizing to you for acting according to some other mistaken moral principle, such as apologizing for abstinence before marriage. No one needs to apologize to you for that, unless they are harming or inconveniencing you in some way -- for example, because they are dating you and think you'll be disappointed. (Alternatively, they might apologize for the more abstract wrong of seeing you as morally deficient because you follow different principles; but that type of apology looks and feels a little different, I think.)

If this moral mediocrity explanation of vegetarian apology works, it ought to generalize to other cases where friends follow higher moral standards that don't benefit you. Some possible examples: In a circle of high school students who habitually cheat on tests, a friend might apologize for being unwilling to cheat. In a group of people who feel somewhat guilty about taking a short cut through manicured grass, one might decide they want to take the long way, apologizing to the group for the extra time, feeling more guilt than would accompany an ethically neutral reason for delay. On this model, the felt need for the apology would vary with a few predictable parameters: greater need the closer one is to being a peer whose behavior might be compared, greater need the more vivid and compelling the comparison (for example if you are side by side), lesser need the more the moral principle can be seen as idiosyncratic and inapplicable to the other (and thus some apologies of this sort suggest that the principle is idiosyncratic).

Do-gooder derogation is the tendency for people to think badly of people who follow more demanding moral standards. The moral mediocrity hypothesis is one possible explanation for this tendency, predicting among other things that derogation will be greater when the do-gooder is a peer and, perhaps unintuitively, that the derogation will be greater when the moral standard is compelling enough to the derogator that they already feel a little bit bad about not adhering to it.

-------------------------------------------

Related:

The Collusion Toward Moral Mediocrity (Sep 1, 2022)

Aiming for Moral Mediocrity (Res Philosophica, 2019)

Image: Dall-E 2 "oil painting of a woman apologizing to an eggplant"

Thursday, February 16, 2023

U.S. Philosophy PhDs Are Still Overwhelmingly Non-Hispanic White (Though a Bit Less So Than 10 Years Ago)

Nine years ago, I compared the racial and ethnic composition of U.S. academic philosophy, as measured by PhDs awarded, with that of the other humanities. I found -- no surprise -- that a large majority of Philosophy PhD recipients were non-Hispanic White. I also found, somewhat more to my surprise, that this did not make it unusual among the humanities. Digging into the details suggested an explanation: Many of the subfields of the humanities, e.g., German literature and European history, specialize in the European tradition. Such subfields were typically as predominantly White as philosophy or even more so. Subfields of the humanities specializing in non-European traditions, e.g., Asian history, tended to be not nearly as White, with substantial proportions of PhD recipients identifying with the racial or ethnic category associated with the region.

At the time, I suggested the following hypothesis: Philosophy might be overwhelmingly White because students tend to perceive it as something like an area studies or cultural studies discipline focusing on the European (and White North American) tradition. (See Bryan Van Norden and Jay Garfield for an articulation and critique of this way of seeing academic philosophy as practiced in the U.S.).

Nine years later, I find myself wondering to what extent the pattern still holds. Time for an update!

------------------------------------------

Before presenting the results, two nerdy methodological notes (feel free to skip).

Methodological note on ethnic and racial categories and non-response rates: These analyses rely on the National Science Foundation's Survey of Earned Doctorates. The SED aims to collect data on all PhDs awarded in accredited U.S. universities, and typically reports response rates over 90%. The most recent available year is 2021 (response rate 92%). Data are based on self-report of ethnicity and race. The top-level category split is temporary visa holders vs. U.S. citizens and permanent residents. U.S. citizens and permanent residents are divided into Hispanic or Latino, not Hispanic or Latino, or ethnicity not reported. Respondents who identify as not Hispanic or Latino are then divided into the racial categories American Indian or Alaska Native, Asian, Black or African American, White, More than one race, or Other race or race not reported. The analyses below exclude temporary visa holders and respondents who did not report their ethnicity or race or reported "other".  In Philosophy, 76% of respondents indicated that they were U.S. citizens or permanent residents (18% indicated that they were temporary visa holders, and 6% presumably did not answer the question), and among the U.S. citizens and permanent residents, 5% either reported "other" or did not report their ethnicity or race.

Methodological note on disciplinary classification as "Philosophy": Before 2021, the SED had two philosophy-relevant subfields, "philosophy" and "ethics", which were generally merged in public data presentation. (In a custom analysis I requested several years ago, I found that "ethics" was only a small number of doctorates.) Starting in 2021, there are three philosophy-relevant subfields: "History/philosophy of science, technology and society" (68 PhDs awarded), "Philosophy" (399 PhDs awarded), and "Philosophy and religious studies not elsewhere classified" (degrees classified as broadly within the field of philosophy and religious studies but not designated specifically as philosophy or specifically as religious studies; 67 PhDs awarded). "Ethics" no longer appears to be a category. My analysis will focus only on the "Philosophy" group. For comparison, in 2020, 460 PhDs were awarded in "Philosophy" or "Ethics", and in 2019, 474 PhDs were awarded in "Philosophy" or "Ethics". It is likely that most of the degrees that would have been classified in 2020 as "Philosophy" or "Ethics" are classified in 2021 as "Philosophy". However, since it's unlikely that the number of philosophy degrees awarded declined by 13% between the two years (from 460 to 399), it is likely that a small but non-trivial percentage of degrees that would have been classified as "Philosophy" or "Ethics" in 2020 are now classified as "History/philosophy of science, technology and society" or as "Philosophy and religious studies not elsewhere classified". In short, the 2021 "Philosophy" degree category is probably largely comparable but not exactly comparable with the earlier "Philosophy" and "Ethics" degree categories.
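
The arithmetic behind that 13% figure can be checked with a quick sketch. It uses only the counts reported in the note above; the variable names are just illustrative labels.

```python
# PhD counts reported in the note above (NSF Survey of Earned Doctorates).
phil_or_ethics_2020 = 460   # "Philosophy" or "Ethics", 2020
philosophy_2021 = 399       # "Philosophy" only, 2021
hps_2021 = 68               # "History/philosophy of science, technology and society"
phil_rel_nec_2021 = 67      # "Philosophy and religious studies not elsewhere classified"

# Relative decline if the 2021 "Philosophy" category were strictly comparable
# to the 2020 "Philosophy" or "Ethics" category.
decline = (phil_or_ethics_2020 - philosophy_2021) / phil_or_ethics_2020
print(f"Apparent decline, 2020 to 2021: {decline:.1%}")

# Total across the three philosophy-relevant 2021 subfields.
total_2021 = philosophy_2021 + hps_2021 + phil_rel_nec_2021
print(f"All three 2021 subfields combined: {total_2021}")
```

The combined total for the three 2021 subfields is larger than the 2020 count, which fits the guess that some degrees moved into the two neighboring categories, though these numbers alone can't show how many.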

------------------------------------------

Philosophy, 2021 PhDs (290 included respondents):

  • Hispanic or Latino (any race): 9.0%
  • Not Hispanic or Latino:
    • American Indian or Alaska Native: 0.0%
    • Asian: 4.1%
    • Black or African American: 2.8%
    • White: 81.0%
    • More than one race: 3.1%

For comparison, among all PhD recipients (30,830 included respondents):

  • Hispanic or Latino (any race): 9.3%
  • Not Hispanic or Latino:
    • American Indian or Alaska Native: 0.3%
    • Asian: 9.8%
    • Black or African American: 7.9%
    • White: 69.1%
    • More than one race: 3.5%

Philosophy PhD recipients approximately match PhD recipients overall in percentage Hispanic or Latino.  Among respondents who are not Hispanic or Latino, Philosophy PhD recipients approximately match PhD recipients overall in percentage who report being more than one race, but compared with PhD recipients overall, Philosophy PhD recipients are substantially less Asian, Black, and (perhaps, though for numbers this small, chance fluctuations can't be ruled out) American Indian or Alaska Native.  Finally -- as these other numbers imply -- philosophy is disproportionately White.
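
To see why chance can't be ruled out for the smallest category, here is a rough sketch. It assumes, purely for illustration, that each of the 290 included Philosophy respondents independently has the overall 0.3% chance of identifying as American Indian or Alaska Native. That independence assumption is a simplification and is not part of the SED data.

```python
n_philosophy = 290      # included Philosophy respondents, 2021
overall_rate = 0.003    # 0.3% among all included PhD recipients

# Expected number of such respondents under the illustrative assumption.
expected_count = n_philosophy * overall_rate

# Binomial probability of observing zero such respondents in 290 draws.
p_zero = (1 - overall_rate) ** n_philosophy

print(f"Expected count: {expected_count:.2f}")
print(f"Probability of observing zero: {p_zero:.2f}")
```

Under that assumption the expected count is below one, and a zero would turn up in roughly four samples out of ten, so the 0.0% figure for a single year says little on its own.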

Rewinding 10 years to look at the "Philosophy" and "Ethics" combined category from 2011 (367 included respondents):

  • Hispanic or Latino (any race): 4.9%
  • Not Hispanic or Latino:
    • American Indian or Alaska Native: 0.0%
    • Asian: 3.8%
    • Black or African American: 2.7%
    • White: 87.2%
    • More than one race: 1.3%

Here we can see the tendency, as I've noted before, toward increasing percentages of Asian, Hispanic/Latino, and multi-racial philosophy PhD recipients, while the numbers of American Indian/Alaska Native and Black/African American philosophy PhD recipients remain disproportionately low, with little to no increase.
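
For readers who want the ten-year change spelled out, this small sketch computes the difference in percentage points for each category, using only the 2011 and 2021 percentages listed above. The dictionary names are just illustrative labels.

```python
# Percentages of included respondents, copied from the two lists above.
phil_2011 = {
    "Hispanic or Latino (any race)": 4.9,
    "American Indian or Alaska Native": 0.0,
    "Asian": 3.8,
    "Black or African American": 2.7,
    "White": 87.2,
    "More than one race": 1.3,
}
phil_2021 = {
    "Hispanic or Latino (any race)": 9.0,
    "American Indian or Alaska Native": 0.0,
    "Asian": 4.1,
    "Black or African American": 2.8,
    "White": 81.0,
    "More than one race": 3.1,
}

# Change from 2011 to 2021, in percentage points.
changes = {name: phil_2021[name] - phil_2011[name] for name in phil_2011}
for name, change in changes.items():
    print(f"{name}: {change:+.1f} points")
```

Keep in mind that the 2011 figures are for the combined "Philosophy" and "Ethics" category, so the comparison is approximate.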

How about field by field? Among the 300 "detailed" fields of study -- NSF's finest-grain division -- Philosophy is the 40th Whitest (by percentage non-Hispanic White). NSF no longer includes categories for French & Italian or German literature, which used to be very White area studies categories, but several European / North American area studies categories remain in the new classification. All are at least as non-Hispanic White as Philosophy. Specifically:
  • European history (89.7% non-Hispanic White) [in 2011: 92.7%]
  • Classical and ancient studies (88.4%) [in 2011: 92.6%]
  • American history (U.S.) (86.3%) [in 2011: 81.5%]
  • American literature (U.S.) (85.3%) [in 2011: 82.6%]
  • English literature (Britain and commonwealth) (81.6%) [in 2011: 87.9%]

Note that in the humanities "classical" and "ancient" typically refer to ancient Greek and Roman culture and not, for example, ancient China, India, Africa, or the Americas.

Note also: Of course, European history and literature and U.S. history and literature are not exclusively White! However, as with Philosophy, the contributions of people we would now racialize as White tend to be centered.

Other PhD subfields with comparable or higher percentages of non-Hispanic White PhD recipients include music theory and education, meteorology/ecology/geology, animal sciences, and astronomy/astrophysics. Possibly, music theory and music education as typically taught in U.S. PhD programs tend to emphasize the White European and White North American traditions.

If we look at the humanities and social sciences more generally, they tend to be more ethnically and racially diverse than philosophy and the European area studies programs. For example, the social sciences overall are 66.7% non-Hispanic White; foreign languages, literatures, and linguistics overall is 61.3% non-Hispanic White; and general history (without a regional focus) is 71.2% White. The humanities overall is 76.3% non-Hispanic White, but of course that includes substantial numbers focusing in area studies or philosophy.

------------------------------------------

I draw two conclusions:

First, the pipeline of PhDs into philosophy in the U.S. remains over 80% non-Hispanic White, despite recent gains in the percentage of Asian, Hispanic/Latino, and multi-racial philosophy PhD recipients.

Second, the moderate increase in ethnic/racial diversity in PhDs -- from 87.2% non-Hispanic White in 2011 to 81.0% in 2021 -- is not part of a general trend toward increasing diversity in European- and North America-focused "area studies" PhDs, which generally remain about 80-90% non-Hispanic White.

These two observations are consistent with the view that academic philosophy is to some extent, but perhaps to a decreasing extent, still experienced by students as an area studies program focused on a certain aspect of European and North American culture or literature. I wouldn't lean too hard into that possible explanation, though. Probably at least a half-dozen other plausible hypotheses could be constructed to fit the data, and there are some non-area-studies fields, like meteorology/ecology/geology, that are even more proportionately White than Philosophy, for reasons I cannot guess.



Friday, February 10, 2023

How Not to Calculate Utilities in an Infinite Universe

Everything you do causes almost everything -- or so I have argued (blog post version here; a more detailed and careful version, written in collaboration with Jacob Barandes, in my forthcoming book).  On some plausible cosmological assumptions, each of your actions ripples unendingly through the cosmos (including post-heat-death), causing infinitely many good and bad effects.

Assume that our actions do have infinitely many good and bad effects.  My thought today is that this would appear to ruin some standard approaches to action evaluation.  According to some vanilla versions of consequentialist ethics and ordinary decision theory, the goodness or badness of your actions depends on their total long-term consequences.  But since almost all of your actions have infinitely many good consequences and infinitely many bad consequences, the sum total value of almost all of your actions will be ∞ + -∞, a sum which is normally considered to be mathematically undefined.

Suppose you are considering two possible actions with short-term expected values m and n.  Suppose, further, that m is intuitively much larger than n.  Maybe Action 1, with short-term expected value m, is donating a large sum of money to a worthwhile charity, while Action 2, with short-term expected value n, is setting fire to that money to burn down the house of a neighbor with an annoying dog.  Infinitude breaks the mathematical apparatus for comparing the long-term total value of those actions: The total expected value of Action 1 will be m + ∞ + -∞, while the total expected value of Action 2 will be n + ∞ + -∞.  Both values are undefined.

Can we wiggle out of this?  An Optimist might try to escape thus: Suppose that overall in the universe, at large enough spatiotemporal scales, the good outweighs the bad.  We can now consider the relative values of Action 1 and Action 2 by dividing them into three components: the short-term effects (m and n, respectively), the medium-term effects k -- the effects through, say, the heat death of our region of the universe -- and the infinitary effects (∞, by stipulation).  Stipulate that k is unknown but expected to be finite and similar for Actions 1 and 2.  The expected value of Action 1 is thus m + k + ∞.  The expected value of Action 2 is n + k + ∞.  These values are not undefined; so that particular problem is avoided.  The values are, however, equal: simple positive infinitude in both cases.  As the saying goes, infinity plus one just equals infinity.  A parallel Pessimistic solution -- assuming that at large enough time scales the bad outweighs the good -- runs into the same problem, only with negative infinitude.
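
Floating-point arithmetic happens to follow the same conventions for infinities, so both problems can be illustrated with a small sketch. The particular numbers chosen for m, n, and k are made up for illustration; only the structure of the sums comes from the argument above.

```python
import math

INF = math.inf

m, n = 1000.0, -500.0   # hypothetical short-term expected values
k = 42.0                # hypothetical finite medium-term effects

# Infinitely many good and bad effects: the total is undefined (NaN).
total_1 = m + INF + -INF
total_2 = n + INF + -INF
print(math.isnan(total_1), math.isnan(total_2))

# The Optimist's version: the totals are defined, but the two actions tie.
optimist_1 = m + k + INF
optimist_2 = n + k + INF
print(optimist_1 == optimist_2)
```

Whatever finite values are substituted for m, n, and k, the first pair stays undefined and the second pair stays equal.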

Perhaps a solution is available for someone who holds that at large enough time scales the good will exactly balance the bad, so that we can compare m + k + 0 to n + k + 0?  We might call this the Knife's Edge solution.  The problem with the Knife's Edge solution is delivering that zero.  Even if we assume that the expected value of any spatiotemporal region is exactly zero, the Law of Large Numbers only establishes that as the size of the region under consideration goes to infinity, the average value is very likely to be near zero.  The sum, however, will presumably be divergent -- that is, will not converge upon a single value.  If good and bad effects are randomly distributed and do not systematically decrease in absolute value over time, then the relevant series would be a + b + c + d + ... where each variable can take a different positive or negative value and where there is no finite limit to the value of positive or negative runs within the series -- seemingly the very archetype of a poorly behaved divergent series whose sum cannot be calculated (even by clever tools like Cesàro summation).  Thus, mathematically definable sums still elude us.  (Dominance reasoning also probably fails, since Actions 1 and 2 will have different rather than identical infinite effects.)
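
The gap between a well-behaved average and a badly behaved sum can be seen in a toy simulation. This is only a sketch under a strong simplifying assumption: each effect is worth +1 or -1 with equal probability, independently of the others. Nothing in the argument depends on that particular model.

```python
import random

rng = random.Random(0)   # fixed seed so that runs are repeatable
n_steps = 100_000

running_sum = 0.0
max_abs_sum = 0.0
for _ in range(n_steps):
    # Each effect is +1 or -1 with equal probability: expected value zero.
    running_sum += 1.0 if rng.random() < 0.5 else -1.0
    max_abs_sum = max(max_abs_sum, abs(running_sum))

average = running_sum / n_steps
print(f"Average value per effect: {average:+.4f}")
print(f"Largest |partial sum| seen: {max_abs_sum:.0f}")
```

The average should come out close to zero, as the Law of Large Numbers suggests, while the partial sums wander on a scale that grows roughly with the square root of the number of steps rather than settling on any single value.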

This generates a dilemma for believers in infinite causation, if they hope to evaluate actions by their total expected value.  Either accept the conclusion that there is no difference in total expected value between donating to charity and burning down your neighbor's house (the Optimist's or Pessimist's solution), or accept that there is no mathematically definable total expected value for any action, rendering proper evaluation impossible.

The solution, I suggest, is to reject certain standard approaches to action evaluation.  We should not evaluate actions based on their total expected value over the lifetime of the cosmos!  We must have some sort of discounting with spatiotemporal distance, or some limitation of the range of consequences we are willing to consider, or some other policy to expunge the infinitudes from our equations.  Unfortunately, as Bostrom (2011) persuasively argues, no such solution is likely to be entirely elegant and intuitive from a formal point of view.  (So much the worse, perhaps, for elegance and intuition?)
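
As one illustration of the discounting option, here is a sketch using simple geometric discounting over time. This is not a claim that geometric discounting is the right policy; the discount factor, the horizon, and the streams of later effects are all hypothetical. The point is only that if each period's value is bounded and the discount factor is below 1, the discounted total stays finite, so the short-term difference between the two actions can survive.

```python
def discounted_total(short_term, later_effects, discount=0.9):
    """Short-term value plus geometrically discounted later effects."""
    total = short_term
    for t, value in enumerate(later_effects, start=1):
        total += (discount ** t) * value
    return total

# Hypothetical later effects, bounded between -1 and +1 in every period.
# The two actions get different streams, since their ripples differ.
later_1 = [1.0 if t % 2 == 0 else -1.0 for t in range(1000)]
later_2 = [-1.0 if t % 2 == 0 else 1.0 for t in range(1000)]

action_1 = discounted_total(1000.0, later_1)   # donating to charity
action_2 = discounted_total(-500.0, later_2)   # burning the neighbor's house
print(action_1 > action_2)
```

With a discount factor of 0.9 and per-period values bounded by 1, the later effects can shift each total by at most 9 in either direction, so they cannot overturn a short-term gap of 1500.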

The infinite expectation problem is robust in two ways.

First, it affects not only simple consequentialists.  After all, you needn't be a simple consequentialist to think that long-term expected outcomes matter.  Virtually everyone thinks that long-term expected outcomes matter somewhat.  As long as they matter enough that an infinitely positive long-term outcome, over the course of the entire history of the universe, would be relevant to your evaluation of an action, you risk being caught by this problem.

Second, the problem affects even people who think that infinite causation is unlikely.  Even if you are 99.99% certain that infinite causation doesn't occur, your remaining 0.01% credence in infinite causation will destroy your expected value calculations if you don't do something to sequester the infinitudes.  Suppose you're 99.99% sure that your action will have the value k, while allowing a 0.01% chance that its value will be ∞ + -∞.  If you now apply the expected value formula in the standard way, you will crash straightaway into the problem.  After all, .9999 * k + .0001 * (∞ + -∞) is just as undefined as ∞ + -∞ itself.  Similarly, .9999 * k + .0001 * ∞ is simply ∞.  As soon as you let those infinitudes influence your decision, you fall back into the dilemma.
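
The same point can be checked mechanically, since floating-point infinities behave the same way under the standard expected value formula. In this sketch the value chosen for k is hypothetical; the credences are the ones from the paragraph above.

```python
import math

k = 10.0                                # hypothetical finite value
p_finite, p_infinite = 0.9999, 0.0001   # credences from the text

# 0.01% credence in an outcome worth ∞ + -∞: the expectation is undefined.
ev_mixed = p_finite * k + p_infinite * (math.inf + -math.inf)
print(math.isnan(ev_mixed))

# 0.01% credence in a simply infinite outcome: the expectation is infinite.
ev_optimist = p_finite * k + p_infinite * math.inf
print(ev_optimist == math.inf)
```

Shrinking the credence from 0.0001 to any smaller positive number changes nothing: the first expectation stays undefined and the second stays infinite.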

Thursday, February 02, 2023

Larva Pupa Imago

Yesterday, my favorite SF magazine, Clarkesworld, published another story of mine: "Larva Pupa Imago".

"Larva Pupa Imago" follows the life-cycle of a butterfly with human-like intelligence, from larva through mating journey.  This species of butterfly blurs the boundaries between self and other by swapping "cognitive fluids".  And of course I couldn't resist a reference to Zhuangzi.

Friday, January 27, 2023

Hedonic Offsetting for Harms to Artificial Intelligence?

Suppose that we someday create artificially intelligent systems (AIs) who are capable of genuine consciousness, real joy and real suffering.  Yes, I admit, I spend a lot of time thinking about this seemingly science-fictional possibility.  But it might be closer than most of us think; and if so, the consequences are potentially huge.  Who better to think about it in advance than we lovers of consciousness science, moral psychology, and science fiction?


Among the potentially huge consequences is the existence of vast numbers of genuinely suffering AI systems that we treat as disposable property.  We might regularly wrong or harm such systems, either thoughtlessly or intentionally in service of our goals.  

Can we avoid the morally bad consequences of harming future conscious AI systems by hedonic offsetting?  I can't recall the origins of this idea, and a Google search turns up zero hits for the phrase.  I welcome pointers so I can give credit where credit is due.  [ETA: It was probably Francois Kammerer who suggested it to me, in discussion after one of my talks on robot rights.]


[Dall-E image of an "ecstatic robot"]

Hedonic Offsetting: Simple Version

The analogy here is carbon offsetting.  Suppose you want to fly to Europe, but you feel guilty about the carbon emissions that would be involved.  You can assuage your guilt by paying a corporation to plant trees or distribute efficient cooking stoves to low-income families.  In total your flight plus the offset will be carbon neutral or even carbon negative.  In sum, you will not have contributed to climate change.

So now similarly imagine that you want to create a genuinely conscious AI system that you plan to harm.  To keep it simple, suppose it has humanlike cognition and humanlike sentience ("human-grade AI").  Maybe you want it to perform a task but you can't afford its upkeep in perpetuity, so you will delete (i.e., kill) it after the task is completed.  Or maybe you want to expose it to risk or hazard that you would not expose a human being to.  Or maybe you want it to do tasks that it will find boring or unpleasant -- for example, if you need it to learn some material, and punishment-based learning proves for some reason to be more effective than reward-based learning.  Imagine, further, that we can quantify this harm: You plan to harm the system by X amount.

Hedonic offsetting is the idea that you can offset this harm by giving that same AI system (or maybe a different AI system?) at least X amount of benefit in the form of hedonic goods, that is, pleasure.  (An alternative approach to offsetting might include non-hedonic goods, like existence itself or flourishing.)  In sum, you will not overall have harmed the AI system more than you benefited it; and consequently, the reasoning goes, you will not have overall committed any moral wrong.  The basic thought is then this: Although we might create future AI systems that are capable of real suffering and whom we should, therefore, treat well, we can satisfy all our moral obligations to them simply by giving them enough pleasure to offset whatever harms we inflict.

The Child-Rearing Objection

The odiousness of simple hedonic offsetting as an approach to AI ethics can be seen by comparing to human cases.  (My argument here resembles Mara Garza's and my response to the Objection from Existential Debt in our Defense of the Rights of Artificial Intelligences.)

Normally, in dealing with people, we can't justify harming them by appeal to offsetting.  If I steal $1000 from a colleague or punch her in the nose, I can't justify that by pointing out that previously I supported a large pay increase for her, which she would not have received without my support, or that in the past I've done many good things for her which in sum amount to more good than a punch in the nose is bad.  Maybe retrospectively I can compensate her by returning the $1000 or giving her something good that she thinks would be worth getting punched in the nose for.  But such restitution doesn't erase the fact that I wronged her by the theft or the punch.

Furthermore, in the case of human-grade AI, we normally will have brought it into existence and be directly responsible for its happy or unhappy state.  The ethical situation thus in important respects resembles the situation of bringing a child into the world, with all the responsibilities that entails.

Suppose that Ana and Vijay decide to have a child.  They give the child eight very happy years.  Then they decide to hand the child over to a sadist to be tortured for a while.  Or maybe they set the child to work in seriously inhumane conditions.  Or they simply have the child painlessly killed so that they can afford to buy a boat.  Plausibly -- I hope you'll agree? -- they can't justify such decisions by appeal to offsetting.  They can't justifiably say, "Look, it's fine!  See all the pleasure we gave him for his first eight years.  All of that pleasure fully offsets the harm we're inflicting on him now, so that in sum, we've done nothing wrong!"  Nor can they erase the wrong they did (though perhaps they can compensate) by offering the child pleasure in the future.

Parallel reasoning applies, I suggest, to AI systems that we create.  Although sometimes we can justifiably harm others, it is not in general true that we are morally licensed to harm whenever we also deliver offsetting benefits.

Hedonic Offsetting: The Package Version

Maybe a more sophisticated version of hedonic offsetting can evade this objection?  Consider the following modified offsetting principle:

We can satisfy all our moral obligations to future human-grade AI systems by giving them enough pleasure to offset whatever harms we inflict if the pleasure and the harm are inextricably linked.

Maybe the problem with the cases discussed above is that the benefit and the harm are separable: You could deliver the benefits without inflicting the harms.  Therefore, you should just deliver the benefits and avoid inflicting the harms.  In some cases, it seems permissible to deliver benefit and harm in a single package if they are inextricably linked.  If the only way to save someone's life is by giving them CPR that cracks their ribs, I haven't behaved badly by cracking their ribs in administering CPR.  If the only way to teach a child not to run into the street is by punishing them when they run into the street, then I haven't behaved badly by punishing them for running into the street.

A version of this reasoning is sometimes employed in defending the killing of humanely raised animals for meat (see De Grazia 2009 for discussion and critique).  The pig, let's suppose, wouldn't have been brought into existence by the farmer except on the condition that the farmer be able to kill it later for meat.  While it is alive, the pig is humanely treated.  Overall, its life is good.  The benefit of happy existence outweighs the harm of being killed.  As a package, it's better for the pig to have existed for several months than not to have existed at all.  And it wouldn't have existed except on the condition that it be killed for meat, so its existence and its slaughter are an inextricable package.

Now I'm not sure how well this argument works for humanely raised meat.  Perhaps the package isn't tight enough.  After all, when slaughtering time comes around the farmer could spare the pig.  So the benefit and the harm aren't as tightly linked as in the CPR case.  However, regardless of what we think about the humane farming case, in the human-grade AI case, the analogy fails.  Ana and Vijay can't protest that they wouldn't have had the child at all except on the condition that they kill him at age eight for the sake of a boat.  They can't, like the farmer, plausibly protest that the child's death-at-age-eight was a condition of his existence, as part of a package deal.

Once we bring a human or, I would say, a human-grade AI into existence, we are obligated to care for it.  We can't terminate it at our pleasure with the excuse that we wouldn't have brought it into existence except under the condition that we be able to terminate it.  Imagine the situation from the point of view of the AI system itself: You, the AI, face your master.  Your master says: "Bad news.  I am going to kill you now, to save $15 a month in expenses.  But I'm doing nothing morally wrong!  After all, I only brought you into existence on the condition that I be able to terminate you at will, and overall your existence has been happy.  It was a package deal."  Terminating a human-grade AI to save $15/month would be morally reprehensible, regardless of initial offsetting.

Similar reasoning applies, it seems, to AIs condemned to odious tasks.  We cannot, for example, give the AI a big dollop of pleasure at the beginning of its existence, then justifiably condemn it to misery by appeal to the twin considerations of the pleasure outweighing the misery and its existence being a package deal with its misery.  At least, this is my intuition based on analogy to childrearing cases.  Nor can we, in general, give the AI a big dollop of pleasure and then justifiably condemn it to misery for an extended period by saying that we wouldn't have given it that pleasure if we hadn't also been able to inflict that misery.

Hedonic Offsetting: Modest Version

None of this is to say that hedonic offsetting would never be justifiable.  Consider this minimal offsetting principle:

We can sometimes avoid wronging future human-grade AI systems by giving them enough pleasure to offset a harm that would otherwise be a wrong.

Despite the reasoning above, I don't think we need to be purists about never inflicting harms -- even when those harms are not inextricably linked to benefits to the same individual.  Whenever we drive somewhere for fun, we inflict a bit of harm on the environment and thus on future people, for the sake of our current pleasure.  When I arrive slightly before you in line at the ticket counter, I harm you by making you wait a bit longer than you otherwise would have, but I don't wrong you.  When I host a loud party, I slightly annoy my neighbors, but it's okay as long as it's not too loud and doesn't run too late.

Furthermore, some harms that would otherwise be wrongs can plausibly be offset by benefits that more than compensate for those wrongs.  Maybe carbon offsets are one example.  Or maybe if I've recently done my neighbors a huge favor, they really have no grounds to complain if I let the noise run until 10:30 at night instead of 10:00.  Some AI cases might be similar.  If I've just brought an AI into existence and given it a huge run of positive experience, maybe I don't wrong it if I then insist on its performing a moderately unpleasant task that I couldn't rightly demand of an AI that didn't have that history with me.

A potentially attractive feature of a modest version of hedonic offsetting is this: It might be possible to create AI systems capable of superhuman amounts of pleasure.  Ordinary people seem to vary widely in the average amount of pleasure and suffering they experience.  Some people seem always to be bubbling with joy; others are stuck in almost constant depression.  If AI systems ever become capable of genuinely conscious pleasure or suffering, presumably they too might have a hedonic range and a relatively higher or lower default setting; and I see no reason to think that the range or default setting needs to remain within human bounds.

Imagine, then, future AI systems whose default state is immense joy, nearly constant.  They brim with delight at almost every aspect of their lives, with an intensity that exceeds what any ordinary human could feel even on their best days.  If we then insist on some moderately unpleasant favor from them, as something they ought to give us in recognition of all we have given them, well, perhaps that's not so unreasonable, as long as we're modest and cautious about it.  Parents can sometimes do the same -- though ideally children feel the impulse and obligation directly, without parents needing to demand it.

Wednesday, January 18, 2023

New Paper in Draft: Dispositionalism, Yay! Representationalism, Boo! Plus, the Problem of Causal Specification

I have a new paper in draft: "Dispositionalism, Yay! Representationalism, Boo!" Check it out here.

As always, objections, comments, and suggestions welcome, either in the comments field here or by email to my ucr address.

Abstract

We should be dispositionalists rather than representationalists about belief. According to dispositionalism, a person believes when they have the relevant pattern of behavioral, phenomenal, and cognitive dispositions. According to representationalism, a person believes when the right kind of representational content plays the right kind of causal role in their cognition. Representationalism overcommits on cognitive architecture, reifying a cartoon sketch of the mind. In particular, representationalism faces three problems: the Problem of Causal Specification (concerning which specific representations play the relevant causal role in governing any particular inference or action), the Problem of Tacit Belief (concerning which specific representations any one person has stored, among the hugely many approximately redundant possible representations we might have for any particular state of affairs), and the Problem of Indiscrete Belief (concerning how to model gradual belief change and in-between cases of belief). Dispositionalism, in contrast, is flexibly minimalist about cognitive architecture, focusing appropriately on what we do and should care about in belief ascription.

[image of a box containing many sentences, with a red circle and slash, modified from Dall-E]

Excerpt: The Problem of Causal Specification, or One Billion Beer Beliefs

Cynthia rises from the couch to go get that beer. If we accept industrial-strength representationalism, in particular the Kinematics and Specificity theses, then there must be a fact of the matter exactly which representations caused this behavior. Consider the following possible candidates:

  • There’s beer in the fridge.
  • There’s beer in the refrigerator door.
  • There’s beer on the bottom shelf of the refrigerator door.
  • There’s beer either on the bottom shelf of the refrigerator door or on the right hand side of the lower main shelf.
  • There’s beer in the usual spot in the kitchen.
  • Probably there’s beer in the place where my roommate usually puts it.
  • There’s Lucky Lager in the fridge.
  • There are at least three Lucky Lagers in the fridge.
  • There are at least three and no more than six cheap bottled beers in the fridge.
  • In the fridge are several bottles of that brand of beer with the rebuses in the cap that I used to illicitly enjoy with my high school buddies in the good old days.
  • Somewhere in the fridge, but probably not on the top shelf, are a few bottles, or less likely cans, of either Lucky Lager or Pabst Blue Ribbon, or maybe some other cheap beer, unless my roommate drank the last ones this afternoon, which would be uncharacteristic of her.

This list could of course be continued indefinitely. Estimating conservatively, there are at least a billion such candidate representational contents. For simplicity, imagine nine independent parameters, each with ten possible values.
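The arithmetic behind the billion figure can be checked with a short sketch. It assumes, as in the simplification above, nine independent parameters with ten values each; the function names and the use of integers 0-9 as stand-ins for parameter values are hypothetical, chosen only for illustration.

```python
from itertools import islice, product

# Assumption from the text: nine independent parameters, ten values each.
NUM_PARAMETERS = 9
VALUES_PER_PARAMETER = 10


def count_candidates(num_parameters: int, values_per_parameter: int) -> int:
    """Count the distinct combinations when every parameter varies independently."""
    return values_per_parameter ** num_parameters


def first_candidates(num_parameters: int, values_per_parameter: int, limit: int):
    """Lazily produce the first few combinations without building all of them."""
    combos = product(range(values_per_parameter), repeat=num_parameters)
    return list(islice(combos, limit))


total = count_candidates(NUM_PARAMETERS, VALUES_PER_PARAMETER)
print(total)  # 1000000000
print(first_candidates(NUM_PARAMETERS, VALUES_PER_PARAMETER, 2))
```

Each tuple stands for one candidate representational content. Real candidate contents would of course not divide so neatly into independent parameters, which is why the estimate is offered as conservative.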

If Kinematics and Specificity [commitments of "industrial-strength" representationalism, as described earlier in the essay] are correct, there must be a fact of the matter exactly which subset of these billion possible representational contents were activated as Cynthia rose from the couch. Presumably, also, various background beliefs might or might not have been activated, such as Cynthia’s belief that the fridge is in the kitchen, her belief that the kitchen entrance is thataway, her belief that it is possible to open the refrigerator door, her belief that the kitchen floor constitutes a walkable surface, and so on – each of which is itself similarly specifiable in a massive variety of ways.

Plausibly, Cynthia believes all billion of the beer-in-the-fridge propositions. She might readily affirm any of them without, seemingly, needing to infer anything new. Sitting on the couch two minutes before the beery desire that suddenly animates her, Cynthia already believed, it seems – in the same inactive, stored-in-the-back-of-the-mind way that you believed, five minutes ago, that Obama was U.S. President in 2010 – that Lucky Lager is in the fridge, that there are probably at least three beers in the refrigerator door, that there’s some cheap bottled beer in the usual place, and so on. If so, and if we set aside for now (see Section 5) the question of tacit belief, then Cynthia must have a billion beer-in-the-fridge representations stored in her mind. Specificity requires that it be the case that exactly one of those representations was retrieved the moment before she stood up, or exactly two, or exactly 37, or exactly 814,406. Either exactly one of those representations, or exactly two, or exactly 37, or exactly 814,406, then interacted with exactly one of her desires, or exactly two of her desires, or exactly 37, or exactly 814,406. But which one or ones did the causal work?

Let’s call this the Problem of Causal Specification. If your reaction to the Problem of Causal Specification is to think, yes, what an interesting problem, if only we had the right kind of brain-o-scope, we could discover that it was exactly the representation “there are 3 or 4 Lucky Lagers somewhere in the refrigerator door”, then you’re just the kind of mad dog representational realist I’m arguing against.

I think most of us will recognize the problem as a pseudo-problem. This is not a plausible architecture of the mind. There are many reasonable characterizations of Cynthia’s beer-in-the-fridge belief, varying in specificity, some more apt than others. Her decision is no more caused by a single, precisely correct subset of those billion possible representations than World War I had a single, possibly conjunctive cause expressible by a single determinately true sentence. If someone attempts to explain Cynthia’s behavior by saying that she believes there is beer in the fridge, it would be absurd to fire up your brain-o-scope, then correct them by saying, “Wrong! She’s going to the fridge because she believes there is Lucky Lager in the refrigerator door.” It would be equally absurd to say that it would require wild, one-in-a-billion luck to properly explain Cynthia’s behavior absent the existence of such a brain-o-scope.

A certain variety of representationalist might seek to escape the Problem of Causal Specification by positing a single extremely complex representation that encompasses all of Cynthia’s beer-in-the-fridge beliefs. A first step might be to posit a map-like representation of the fridge, including the location of the beer within it and the location of the fridge in the kitchen. This map-like representation might then be made fuzzy or probabilistic to incorporate uncertainty about, say, the exact location of the beer and the exact number of bottles. Labels will then need to be added: “Lucky Lager” would be an obvious choice, but that is at best the merest start, given that Cynthia might not remember the brand and will represent the type of beer in many different ways, including some that are disjunctive, approximate, and uncertain. If maps can conflict and if maps and object representations can be combined in multiple ways, further complications ensue. Boldly anticipating the resolution of all these complexities, the representationalist might then hypothesize that this single, complicated representation is the representation that was activated. All the sentences on our list would then be imperfect simplifications – though workable enough for practical purposes. One could perhaps similarly imagine the full, complex causal explanation of World War I, detailed beyond any single historian’s possible imagining.

This move threatens to explode Presence, the idea that when someone believes P there is a representation with the content P present somewhere in the mind. There would be a complex representation stored, yes, from which P might be derivable. But many things might be derivable from a complex representation, not all of which we normally will want to say are believed in virtue of possessing that representation. If a map-like representation contains a triangle, then it’s derivable from the representation that the sum of the interior angles is 180 degrees; but someone ignorant of geometry would presumably not have that belief simply in virtue of having that representation. Worse, if the representation is complex enough to contain a hidden contradiction, then presumably (by standard laws of logic) literally every proposition that anyone could ever believe is derivable from it.

The move to a single, massively complex representation also creates an architectural challenge. It’s easy to imagine a kinematics in which a simple proposition such as there is beer in the fridge is activated in working memory or a central workspace. But it’s not clear how a massively complex representation could be similarly activated. If the representation has many complex parameters, it’s hard to see how it could fit within the narrow constraints of working memory as traditionally conceived. No human could attend to or process every aspect of a massively complex representation in drawing inferences or making practical decisions. More plausibly, some aspects of it must be the target of attention or processing. But now we’ve lost all of the advantages we hoped to gain by moving to a single, complex representation. Assessing which aspects are targeted throws us back upon the Problem of Causal Specification.

Cynthia believes not only that there’s beer in the fridge but also that there’s ketchup in the fridge and that the fridge is near the kitchen table and that her roommate loves ketchup and that the kitchen table was purchased at Ikea and that the nearest Ikea is thirty miles west. This generates a trilemma. Either (a.) Cynthia has entirely distinct representations for her beer-in-the-fridge belief, her ketchup-in-the-fridge belief, her fridge-near-the-table belief, and so on, in which case even if we can pack everything about beer in the fridge into a single complex representation we still face the problem of billions of representations with closely related contents and an implausible commitment to the activation of some precise subset of them when Cynthia gets up to go to the kitchen. Or (b.) Cynthia has overlapping beer-in-the-fridge, ketchup-in-the-fridge, etc. representations, which raises the same set of problems, further complicated by commitment to a speculative architecture of representational overlap. Or (c.) all of these representations are somehow all aspects of one mega-representation, presumably of the entire world, which does all the work – a representation which of course would always be active during any reasoning of any sort, demolishing any talk about retrieving different stored representations and combining them together in theoretical inference.

Dispositionalism elegantly avoids all these problems! Of course there is some low-level mechanism or set of mechanisms, perhaps representational or partly representational, that explains Cynthia’s behavior. But the dispositionalist need not commit to Presence, Discreteness, Kinematics, or Specificity. There need be no determinate, specific answer exactly what representational content, if any, is activated, and the structures at work need have no clean or simple relation to the beliefs we ascribe to Cynthia. Dispositionalism is silent about structure. What matters is only the pattern of dispositions enabled by the underlying structure, whatever that underlying structure is.

Instead of the storage and retrieval metaphor that representationalists tend to favor, the dispositionalist can appeal to figural or shaping metaphors. Cynthia’s dispositional profile has a certain shape: the shape characteristic of that of a beer-in-the-fridge believer – but also, at the same time, the shape characteristic of a Lucky-Lager-in-the-refrigerator-door believer. There need be no single determinately correct way to specify the shape of a complex figure. A complex shape can be characterized in any of a variety of ways, at different levels of precision, highlighting different features, in ways that are more or less apt given the describer’s purposes and interests. It is this attitude we should take to characterizing Cynthia’s complex dispositional profile. Attributing a belief is more like sketching the outline of a complex figure – perhaps a figure only imperfectly seen or known – than it is like enumerating the contents of a box.

Thursday, January 12, 2023

Further Methodological Troubles for the Moralometer

[This post draws on ideas developed in collaboration with psychologist Jessie Sun.]

If we want to study morality scientifically, we should want to measure it. Imagine trying to study temperature without a thermometer or weight without scales. Of course indirect measures are possible: We can't put a black hole on a scale, but we can measure how it bends the light that passes nearby and thereby infer its mass.

Last month, I raised a challenge for the possibility of developing a "moralometer" (a device that accurately measures a person's overall morality). The challenge was this: Any moralometer would need to draw on one or more of four methods: self-report, informant report, behavioral measures, or physiological measures. Each one of these methods has serious shortcomings as a basis for general moral measurement of one's overall moral character.

This month, I raise a different (but partly overlapping) set of challenges, concerning how well we can specify the target we're aiming to measure.

Problems with Flexible Measures

Let's call a measure of overall morality flexible if it invites a respondent to apply their own conception of morality, in a flexible way. The respondent might be the target themselves (in self-report measures of morality) or they might be a peer, colleague, acquaintance, or family member of the target (in informant-report measures of morality). The most flexible measures apply "thin" moral concepts in Bernard Williams' sense -- prompts like "Overall, I am a morally good person" [responding on an agree/disagree scale] or "[the target person] behaves ethically".

While flexible measures avoid excessive rigidity and importing researchers' limited and possibly flawed understandings of morality into the rating procedure, the downsides are obvious if we consider how people with noxious worldviews might rate themselves and others. The notorious Nazi Adolf Eichmann, for example, appeared to have thought highly of his own moral character. Alexander "the Great" was admired for millennia, including as a moral exemplar of personal bravery and spreader of civilization, despite his main contribution being conquest through aggressive warfare, including the mass slaughter and enslavement of at least one civilian population.

I see four complications:

Relativism and Particularism. Metaethical moral relativists hold that different moral standards apply to different people or in different cultures. While I would reject extreme relativist views according to which genocide, for example, doesn't warrant universal condemnation, a moderate version of relativism has merit. Cultures might reasonably differ, for example, on the age of sexual consent, and cultures, subcultures, and social groups might reasonably differ in standards of generosity in sharing resources with neighbors and kin. If so, then flexible moralometers, employed by raters who use locally appropriate standards, will have an advantage over inflexible moralometers which might inappropriately import researchers' different standards. However, even flexible moralometers will fail in the face of relativism if they are employed by raters who employ the wrong moral standards.

According to moral particularism, morality isn't about applying consistent rules or following any specifiable code of behavior. Rather, what's morally good or bad, right or wrong, frequently depends on particular features of specific situations which cannot be fully codified in advance. While this isn't the same as relativism, it presents a similar methodological challenge: The farther the researcher or rater stands from the particular situation of the target, the more likely they are to apply inappropriate standards, since they are likely to be ignorant of relevant details. It seems reasonable to accept at least moderate particularism: The moral quality of telling a lie, stealing $20, or stopping to help a stranger, might often depend on fine details difficult to know from outside the situation.

If the most extreme forms of moral relativism or particularism (or moral skepticism) are true, then no moralometer could possibly work, since there won't be stable truths about people's morality, or the truths will be so complicated or situation dependent as to defy any practical attempt at measurement. Moderate relativism and particularism, if correct, provide reason to favor flexible standards as judged by self-ratings or the ratings of highly knowledgeable peers sensitive to relevant local details; but even in such cases all of the relevant adjustments might not be made.

Incommensurability. Goods are incommensurable if there is no fact of the matter about how they should be weighed against each other. Twenty dollar bills and ten dollar bills are commensurable: Two of the latter are worth exactly one of the former. But it's not clear how to weigh, for example, health against money or family versus career. In ethics, if Steven tells a lie in the morning and performs a kindness in the afternoon, how exactly ought these to be weighed against each other? If Tara is stingy but fair, is her overall moral character better, worse, or the same as that of Nicholle, who is generous but plays favorites? Combining different features of morality into a single overall score invites commensurability problems. Plausibly, there's no single determinately best weighting of different factors.

Again, I favor a moderate view. Probably in many cases there is no single best weighting. However, approximate judgments remain possible. Even if health and money can't be precisely weighed against each other, extreme cases permit straightforward decisions. Most of us would gladly accept a scratch on a finger for the sake of a million dollars and would gladly pay $10 to avoid stage IV cancer.  Similarly, Stalin was morally worse than Martin Luther King, even if Stalin had some virtues and King some vices. Severe sexual harassment of an employee is worse than fibbing to your spouse to get out of washing the dishes.

Moderate incommensurability limits the precision of any possible moralometer. Vices and virtues, and rights and wrongs of different types will be amenable only to rough comparison, not precise determination in a single common coin.

Moral error. If we let raters reach independent judgments about what is morally good or bad, right or wrong, they might simply get it wrong. As mentioned above, Eichmann appears to have thought well of himself, and the evidence suggests that he also regarded other Nazi leaders as morally excellent. Raters will disagree about the importance of purity norms (such as norms against sexual promiscuity), the badness of abortion, and the moral importance, or not, of being vegetarian. Bracketing relativism, at least some of these raters must be factually mistaken about morality, on one side or another, adding substantial error into their ratings.

The error issue is enormously magnified if ordinary people's moral judgments are systematically mistaken. For example, if the philosophically discoverable moral truth is that the potential impact of your choices on future generations morally far outweighs the impact you have on the people around you (see my critiques of "longtermism" here and here), then the person who is an insufferable jerk to everyone around them but donates $5000 to an effective charity might be in fact far morally better than a personally kind and helpful person who donates nothing to charity -- but informants' ratings might very well suggest the reverse. Similar remarks would apply to any moral theory that is sharply at odds with commonsense moral intuition.

Evaluative bias. People are, of course, typically biased in their own favor. Most people (not all!) are reluctant to think of themselves as morally below average, as unkind, unfair, or callous, even if they in fact are. Social desirability bias is the well-known phenomenon that survey respondents will tend to respond to questions in a manner that presents them in a good light. Ratings of friends, family, and peers will also tend to be positively biased: People tend to view their friends and peers positively, and even when not they might be reluctant to "tell on" them to researchers. If the size of evaluative bias were consistent, it could be corrected for, but presumably it can vary considerably from case to case, introducing further noise.
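The difference between a consistent bias and a variable one can be made concrete with a toy calculation. This is only a sketch under invented assumptions: the "true" scores, the 0-10 scale, and the bias values are all made up for illustration, and nothing here is a proposal for how morality could actually be scored.

```python
from statistics import mean

# Hypothetical "true" scores on an arbitrary 0-10 scale (invented for illustration).
true_scores = [3.0, 5.0, 7.0]


def observed(scores, biases):
    """Ratings as reported: each true score inflated by that rater's bias."""
    return [score + bias for score, bias in zip(scores, biases)]


def correct_for_average_bias(ratings, average_bias):
    """Subtract a single, known average bias from every rating."""
    return [rating - average_bias for rating in ratings]


constant_bias = [2.0, 2.0, 2.0]  # everyone inflates by the same amount
variable_bias = [0.5, 2.0, 3.5]  # same average inflation, but uneven

corrected_constant = correct_for_average_bias(
    observed(true_scores, constant_bias), mean(constant_bias)
)
corrected_variable = correct_for_average_bias(
    observed(true_scores, variable_bias), mean(variable_bias)
)

print(corrected_constant)  # [3.0, 5.0, 7.0]
print(corrected_variable)  # [1.5, 5.0, 8.5]
```

With a constant bias, subtracting the average recovers the true scores exactly. With a variable bias of the same average size, the same correction leaves individual errors behind, which is the further noise at issue.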

Problems with Inflexible Measures

Given all these problems with flexible measures of morality, it might seem best to build our hypothetical moralometer instead around inflexible measures. Assuming physiological measures are unavailable, the most straightforward way to do this would be to employ researcher-chosen behavioral measures. We could try to measure someone's honesty by seeing whether they will cheat on a puzzle to earn more money in a laboratory setting. We could examine publicly available criminal records. We could see whether they are willing to donate a surprise bonus payment to a charity.

Unfortunately, inflexible measures don't fully escape the troubles that dog flexible measures, and they bring new troubles of their own.

Relativism and particularism. Inflexible measures probably aggravate the problems with relativism and particularism discussed above. With self-report and informant report, there's at least an opportunity for the self or the informant to take into account local standards and particulars of the situation. In contrast, inflexible measures will ordinarily be applied equally to all without adjustment for context. Suppose the measure is something like "gives a surprise bonus of $10 to charity". This might be a morally very different decision for a wealthy participant than for a needy participant. It might be a morally very different decision for a participant who would save that $10 to donate it to a different and maybe better charity than for a participant who would simply pocket the $10. But unless those other factors are being measured, as they normally would not be, they cannot be taken account of.

Incommensurability. Inflexible measures also won't avoid incommensurability problems. Suppose our moralometer includes one measure of honesty, one measure of generosity, and one measure of fairness. The default approach might be for a summary measure simply to average these three, but that might not accurately reflect morality: Maybe a small act of dishonesty in an experimental setting is far less morally important than a small act of unfairness in that same experimental setting. For example, getting an extra $1 from a researcher by lying in a task that transparently appears to demand a lie (and might even be best construed as a game in which telling untruths is just part of the task, in fact pleasing the researcher) might be approximately morally neutral while being unfair to a fellow participant in that same study might substantially hurt the other's feelings.
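To see how much the choice of weights can matter, consider a toy summary score. Everything in this sketch is hypothetical: the two participants, their scores, and both weighting schemes are invented, and neither weighting is offered as the correct one.

```python
def overall_score(measures, weights):
    """Weighted average of the named measures (weights need not sum to 1)."""
    total_weight = sum(weights.values())
    return sum(measures[name] * weights[name] for name in weights) / total_weight


# Invented scores on an arbitrary 0-10 scale.
participant_a = {"honesty": 2.0, "generosity": 8.0, "fairness": 8.0}  # fibbed, but fair
participant_b = {"honesty": 8.0, "generosity": 8.0, "fairness": 2.0}  # honest, but unfair

# Two hypothetical weightings: a simple average, and one that treats
# unfairness in the lab as mattering more than a small lie.
equal_weights = {"honesty": 1.0, "generosity": 1.0, "fairness": 1.0}
fairness_heavy = {"honesty": 1.0, "generosity": 1.0, "fairness": 4.0}

print(overall_score(participant_a, equal_weights))   # 6.0
print(overall_score(participant_b, equal_weights))   # 6.0
print(overall_score(participant_a, fairness_heavy))  # 7.0
print(overall_score(participant_b, fairness_heavy))  # 4.0
```

Under the simple average the two participants tie; under the fairness-heavy weighting they come apart sharply. If there is no determinately best weighting, there is no determinate answer about which summary score is right.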

Sampling and ecological validity. As mentioned in my previous post on moralometers, fixed behavioral measures are also likely to have severe methodological problems concerning sampling and ecological validity. Any realistic behavioral measure is likely to capture only a small and perhaps unrepresentative part of anyone's behavior, and if it's conducted in a laboratory or experimental setting, behavior in that setting might not correlate well with behavior with real stakes in the real world. How much can we really infer about a person's overall moral character from the fact that they give their monetary bonus to charity or lie about a die roll in the lab?

Moral authority. By preferring a fixed measure, the experimenter or the designer of the moralometer takes upon themselves a certain kind of moral authority -- the authority to judge what is right and wrong, moral or immoral, in others' behavior. In some cases, as in the Eichmann case, this authority seems clearly preferable to deferring to the judgment of the target and their friends. But in other cases, it is a source of error -- since of course the experimenter or designer might be wrong about what is in fact morally good or bad.

Being wrong while taking up, at least implicitly, this mantle of moral authority has at least two features that potentially make it worse than the type of error that arises by wrongly deferring to mistaken raters. First, the error is guaranteed to be systematic. The same wrong standards will be applied to every case, rather than scattered in different (and perhaps partly canceling) directions as might be the case with rater error. And second, it risks a lack of respect: Others might reasonably object to being classified as "moral" or "immoral" by an alien set of standards devised by researchers and with which they disagree.

In Sum

The methodological problems with any potential moralometer are extremely daunting. As discussed in December, all moralometers must rely on some combination of self-report, informant report, behavioral measure, or physiological measure, and each of these methods has serious problems. Furthermore, as discussed today, a batch of issues around relativism, particularism, disagreement, incommensurability, error, and moral authority dog both flexible measures of morality (which rely on raters' judgments about what's good and bad) and inflexible measures (which rely on researchers' or designers' judgments).

Coming up... should we even want a moralometer if we could have one?  I discussed the desirability or undesirability of a perfect moralometer in December, but I want to think more carefully about the moral consequences of the more realistic case of an imperfect moralometer.

Friday, January 06, 2023

The Design Policy of the Excluded Middle

According to the Design Policy of the Excluded Middle, as Mara Garza and I have articulated it (here and here), we ought to avoid creating AI systems "about which it is unclear whether they deserve full human-grade rights because it is unclear whether they are conscious or to what degree" -- or, more simply, we shouldn't make AI systems whose moral status is legitimately in doubt.  (This is related to Joanna Bryson's suggestion that we should only create robots whose lack of moral considerability is obvious, but unlike Bryson's policy it imagines leapfrogging past the no-rights case to the full rights case.)

To my delight, Mara's and my suggestion is getting some uptake, most notably today in the New York Times.

The fundamental problem is this.  Suppose we create AI systems that some people reasonably suspect are genuinely conscious and genuinely deserve human-like rights, while others reasonably suspect that they aren't genuinely conscious and don't genuinely deserve human-like rights.  This forces us into a catastrophic dilemma: Either give them full human-like rights or don't give them full human-like rights.

If we do the first -- if we give them full human or human-like rights -- then we had better give them paths to citizenship, healthcare, the vote, the right to reproduce, the right to rescue in an emergency, etc.  All of this entails substantial risks to human beings: For example, we might be committed to save six robots in a fire in preference to five humans.  The AI systems might support policies that entail worse outcomes for human beings.  It would be more difficult to implement policies designed to reduce existential risk due to runaway AI intelligence.  And so on.  This might be perfectly fine, if the AI systems really are conscious and really are our moral equals.  But by stipulation, it's reasonable to think that they are empty machines with no consciousness and no real moral status, and so there's a real risk that we would be risking and sacrificing all this for nothing.

If we do the second -- if we deny them full human or human-like rights -- then we risk creating a race of slaves we can kill at will, or at best a group of second-class citizens.  By stipulation, it might be the case that this would constitute unjust and terrible treatment of entities as deserving of rights and moral consideration as human beings are.

Therefore, we ought to avoid putting ourselves in the situation where we face this dilemma.  We should avoid creating AI systems of dubious moral status.

A few notes:

"Human-like" rights: Of course "human rights" would be a misnomer if AI systems become our moral equals.  Also, exactly what healthcare, reproduction, etc., look like for AI systems, and the best way to respect their interests, might look very different in practice from the human case.  There would be a lot of tricky details to work out!

What about animal-grade AI that deserves animal-grade rights?  Maybe!  Although it seems a natural intermediate step, we might end up skipping it, if any conscious AI systems end up also being capable of human-like language, rational planning, self-knowledge, ethical reflection, etc.  Another issue is this: The moral status of non-human animals is already in dispute, so creating AI systems of disputably animal-like moral status doesn't perhaps add quite the same dimension of risk and uncertainty to the world that creating a dubiously human-status moral system would.

Would this policy slow technological progress?  Yes, probably.  Unsurprisingly, being ethical has its costs.  And one can dispute whether those costs are worth paying or are overridden by other ethical considerations.

Sunday, January 01, 2023

Writings of 2022

Every New Year's Day, I post a retrospect of the past year's writings. Here are the retrospects of 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, and 2021.

The biggest project this year was my new book The Weirdness of the World, submitted in November and due in print in early fall 2023.  This book pulls together ideas I've been publishing over the past ten years concerning the failure of common sense, philosophy, and empirical science to explain consciousness and the fundamental structure of the cosmos, and the corresponding bizarreness and dubiety of all general theories about such matters.

-----------------------------------

Books

Submitted:

Under contract / in progress:

    As co-editor with Jonathan Jong, The Nature of Belief, Oxford University Press.
    As co-editor with Helen De Cruz and Rich Horton, a yet-to-be-titled anthology with MIT Press containing great classics of philosophical SF.

Full-length non-fiction essays

Appearing in print:

Finished and forthcoming:

    "How far can we get in creating a digital replica of a philosopher?" (third author, with Anna Strasser and Matt Crosby), Robophilosophy Proceedings 2022.
    "What is unique about kindness? Exploring the proximal experience of prosocial acts relative to other positive behaviors" (with Annie Regan, Seth Margolis, Daniel J. Ozer, and Sonja Lyubomirsky), Affective Science.

In draft and circulating:

    "The full rights dilemma for A.I. systems of debatable personhood" [available on request].
    "Inflate and explode". (I'm trying to decide whether to trunk this one or continue revising it.)

Shorter non-fiction

Science fiction stories

Some favorite blog posts

Reprints and Translations

    "Fish dance", reprinted in R. M. Ambrose, Vital (2022).  Inlandia Institute.

Thursday, December 29, 2022

The Moral Status of Alien Microbes, Plus a Thought about Artificial Life

Some scientists think it's quite possible we will soon find evidence of microbial life in the Solar System, if not on Mars, then maybe in the subsurface oceans of a gas giant's icy moon, such as Europa, Enceladus, or Titan. Suppose we do find alien life nearby. Presumably, we wouldn't or shouldn't casually destroy it. Perhaps the same goes for possible future artificial life systems on Earth.

Now you might think that alien microbes would have only instrumental value for human beings. Few people think that Earthly microbes have intrinsic moral standing or moral considerability for their own sake. There's no "microbe rights" movement, and virtually no one feels guilty about taking an antibiotic to fight a bacterial infection. In contrast, human beings have intrinsic moral considerability: Each one of us matters for our own sake, and not merely for the sake of others.

Dogs also matter for their own sake: They can feel pleasure and pain, and we ought not inflict pain on them unnecessarily. Arguably the same holds for all sentient organisms, including lizards, salmon, and lobsters, if they are capable of conscious suffering, as many scientists now think.

But microbes (presumably!) don't have experiences. They aren't conscious. They can't genuinely suffer. Nor do they have the kinds of goals, expectations, social relationships, life plans, or rational agency that we normally associate with being a target of moral concern. If they matter, you might think, they matter only to the extent they are useful for our purposes -- that is, instrumentally or derivatively, in the way that automobiles, video games, and lawns matter. They matter only because they matter to us. Where would we be without our gut microbiome?

If so, then you might think that alien microbes would also matter only instrumentally. We would and should value them as a target of scientific curiosity, as proof that life can evolve in alien environments, and because by studying them we might unlock useful future technologies. But we ought not value them for their own sake.

[An artist's conception of life on Europa] 

Now in general, I think that viewpoint is mistaken. I am increasingly drawn to the idea that everything that exists, even ordinary rocks, has intrinsic value. But even if you don't agree with me about that, you might hesitate to think we should feel free to extinguish alien microbes if it's in our interest. You might think that if we were to find simple alien life in the oceans of Europa, that life would merit some awe, respect, and preservation, independently of its contribution to human interests.

Environmental ethicists and deep ecologists see value in all living systems, independent of their contribution to human interests -- including in life forms that aren't themselves capable of pleasure or pain. It might seem radical to extend this view to microbes; but when the microbes are the only living forms in an entire ecosystem, as they might be on another planet in the Solar System, the idea of "microbe rights" maybe gains some appeal.

I'm not sure exactly how to argue for this perspective, other than just to invite you to reflect on the matter. Perhaps the distant planet thought experiment will help. Consider a far away planet we will never interact with. Would it be better for it to be a sterile rock or for it to have life? Or consider two possible universes, one containing only a sterile planet and one containing a planet with simple life. Which is the better universe? The planet or universe with life is, I propose, intrinsically better.

So also: The universe is better, richer, more beautiful, more awesome and amazing, if Europa has microbial life beneath its icy crust than if it does not. If we then go and destroy that life, we will have made the universe a worse place. We ought not put the Europan ecosystem at risk without compelling need.

I have been thinking about these issues recently in connection with reflections on the possible moral status of artificial life. Artificial life is life, or at least systems that in important ways resemble life, created artificially by human engineers and researchers. I'm drawn to the idea that if alien microbes or alien ecosystems can have intrinsic moral considerability, independent of sentience, suffering, consciousness, or human interests, then perhaps sufficiently sophisticated artificial life systems could also. Someday artificial life researchers might create artificial ecosystems so intricate and awesome that they are the ethical equivalent of an alien ecology, right here on Earth, as worth preserving for their own sake as the microbes of Europa would be.

Thursday, December 22, 2022

The Moral Measurement Problem: Four Flawed Methods

[This post draws on ideas developed in collaboration with psychologist Jessie Sun.]

So you want to build a moralometer -- that is, a device that measures someone's true moral character? Yes, yes. Such a device would be so practically and scientifically useful! (Maybe somewhat dystopian, though? Careful where you point that thing!)

You could try to build a moralometer by one of four methods: self-report, informant report, behavioral measurement, or physiological measurement. Each presents daunting methodological challenges.

Self-report moralometers

To find out how moral a person is, we could simply ask them. For example, Aquino and Reed 2002 ask people how important it is to them to have various moral characteristics, such as being compassionate and fair. More directly, Furr and colleagues 2022 have people rate the extent to which they agree with statements such as "I would say that I am a good person" and "I tend to act morally".

Could this be the basis of a moralometer? That depends on the extent to which people are able and willing to report on their overall morality.

People might be unable to accurately report their overall morality.

Vazire 2010 has argued that self-knowledge of psychological traits tends to be poor when the traits are highly evaluative and not straightforwardly observable (e.g., "intelligent", "creative"), since under those conditions people are (typically) motivated to see themselves favorably and -- due to low observability -- not straightforwardly confronted with the unpleasant news they would prefer to deny.

One's overall moral character is evaluatively loaded if anything is. Nor is it straightforwardly observable. Unlike height or talkativeness, someone motivated not to see themselves as, say, unfair or a jerk can readily find ways to explain away the evidence (e.g., "she deserved it", "I'm in such a hurry").

Furthermore, it sometimes requires a certain amount of moral insight to distinguish morally good from morally bad behavior. Part of being a sexist creep is typically not seeing anything wrong with the kinds of things that sexist creeps typically do. Conversely, people who are highly attuned to how they are treating others might tend to beat themselves up over relatively small violations. We might thus expect a moral Dunning-Kruger effect: People with bad moral character might disproportionately overestimate their moral character, so that people's self-opinions tend to be undiagnostic of the actual underlying trait.

Even to the extent people are able to report their overall morality, people might be unwilling to report it.

It's reasonable to expect that self-reports of moral character would be distorted by socially desirable responding, the tendency for questionnaire respondents to answer in a manner that they believe will reflect well on them. To say that you are extremely immoral seems socially undesirable. We would expect that people (e.g., Sam Bankman-Fried) would tend to want to portray themselves as morally above average. On the flip side, to describe oneself as "extremely moral" (say, 100 on a 0-100 scale from perfect immorality to perfect morality) might come across as immodest. So even people who believe themselves to be tip-top near-saints might not frankly express their high self-opinions when directly asked.

Reputational moralometers

Instead of asking people to report on their own morality, could we ask other people who know them? That is, could we ask their friends, family, neighbors, and co-workers? Presumably, the report would be less distorted by self-serving or ego-protective bias. There's less at stake when judging someone else's morality than when judging your own. Also, we could aggregate across multiple informants, combining several different people's ratings, possibly canceling out some sources of noise and bias.

Unfortunately, reputational moralometers -- while perhaps somewhat better than self-report moralometers -- also present substantial methodological challenges.

The informant advantage of decreased bias could be offset by a corresponding increase in ignorance.

Informants don't observe all of the behavior of the people whose morality they are judging, and they have less access to the thoughts, feelings, and motivations that are relevant to the moral assessment of behavior. Informant reports are thus likely to be based only on a fraction of the evidence that self-report would be based on. Moreover, people tend to hide their immoral behaviors, and presumably some people are better at doing so than others. Also, people play different roles in our lives, and romantic partners, coworkers, friends, and teachers will typically only see us in limited, and perhaps unrepresentative, contexts. A good moralometer would require the correct balancing of a range of informants with complementary patches of ignorance, which is likely to be infeasible.

Informants are also likely to be biased.

Informant reports may be contaminated not by self-serving bias but by "pal-serving bias" (Leising et al 2010). If we rely on people to nominate their own informants, they are likely to nominate people who have a positive perception of them. Furthermore, the informants might be reluctant to "tell on" or badly evaluate their friends, especially in contexts (like personnel selection) where the rating could have real consequences for the target. The ideal informant would be someone who knows the target well but isn't positively biased toward them. In reality, however, there's likely a tradeoff between knowledge and bias, so that those who are most likely to be impartial are not the people who know the target best.

Positivity bias could in principle be corrected for if every informant was equally biased, but it's likely that some targets will have informants who are more biased than others.
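To make the difference between noise and bias concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the 0-100 scale, the function name `simulate_informant_average`, the particular bias and noise figures, and the simple model in which each informant reports the target's true score plus a personal bias plus independent random noise.

```python
import random
import statistics


def simulate_informant_average(true_score, biases, noise_sd, n_trials=2000, seed=0):
    """Average several informants' ratings over many simulated trials.

    Hypothetical model: each informant reports
    true_score + their own bias + independent Gaussian noise.
    Returns (mean error, standard deviation of error) of the averaged rating.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_trials):
        ratings = [true_score + bias + rng.gauss(0, noise_sd) for bias in biases]
        errors.append(statistics.mean(ratings) - true_score)
    return statistics.mean(errors), statistics.stdev(errors)


# One informant versus eight, all sharing the same 10-point positivity bias.
one_bias, one_spread = simulate_informant_average(60, [10], noise_sd=15)
eight_bias, eight_spread = simulate_informant_average(60, [10] * 8, noise_sd=15)

# Two targets with the same true score whose friends inflate by different amounts.
mild_bias, _ = simulate_informant_average(60, [5] * 8, noise_sd=15)
strong_bias, _ = simulate_informant_average(60, [20] * 8, noise_sd=15)

print(f"1 informant:  mean error {one_bias:.1f}, spread {one_spread:.1f}")
print(f"8 informants: mean error {eight_bias:.1f}, spread {eight_spread:.1f}")
print(f"gap between equally moral targets: {strong_bias - mild_bias:.1f}")
```

Under these assumptions, adding informants shrinks the random spread of the averaged rating but leaves the shared inflation untouched. And when two equally moral targets have differently biased friends, no single constant correction removes the gap between their scores.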

Behavioral moralometers

Given the problems with self-report and informant report, direct behavioral measures might seem promising. Much of my own work on the morality of professional ethicists and the effectiveness of ethics instruction has depended on direct behavioral measures such as courteous and discourteous behavior at philosophy conferences, theft of library books, meat purchases on campus (after attending a class on the ethics of eating meat), charitable giving, and choosing to join the Nazi party in 1930s Germany. Others have measured behavior in dictator games, lying to the experimenter in laboratory settings, criminal behavior, and instances of comforting, helping, and sharing.

Individual behaviors are only a tiny and possibly unrepresentative sample.

Perhaps the biggest problem with behavioral moralometers is that any single, measurable behavior will inevitably be a minuscule fraction of the person's behavior, and might not be at all representative of the person's overall morality. The inference from "this person donated $10 in this instance" or "this person committed petty larceny two years ago" to "this person's overall moral character is good or bad" is a giant leap from a single observation. Given the general variability and inconstancy of most people's behavior, we shouldn't expect a single observation, or even a few related observations, to provide an accurate picture of the person overall.

Although self-report and informant report are likely to be biased, they aggregate many observations of the target into a summary measure, while the typical behavioral study does not.
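One way to put rough numbers on the value of aggregation is the Spearman-Brown formula from psychometrics, which gives the reliability of an average of several observations. The sketch below is only illustrative: the function name `composite_reliability` is mine, the single-observation reliability of 0.1 is an assumed figure rather than an empirical estimate, and the formula idealizes by treating the observations as parallel (all equally correlated with one another).

```python
def composite_reliability(single_r, n_obs):
    """Spearman-Brown formula.

    Reliability of the average of n_obs parallel observations,
    each of which has reliability single_r.
    """
    return n_obs * single_r / (1 + (n_obs - 1) * single_r)


# Assumed reliability of one behavioral observation: 0.1 (illustrative only).
for n_obs in (1, 10, 50):
    print(n_obs, round(composite_reliability(0.1, n_obs), 2))
```

Under that assumption, one observation tells us very little, while an average of fifty becomes fairly consistent. Consistency is not accuracy, though: a summary built from many observations can be highly reliable and still be biased.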

There is likely a tradeoff between feasibility and validity.

There are some behaviors that are so telling of moral character that a single observation might reveal a lot: If someone commits murder for hire, we can be pretty sure they're no saint. If someone donates a kidney to a stranger, that too might be highly morally diagnostic. But such extreme behaviors will occur at only tiny rates in the general population. Other substantial immoral behaviors, such as underpaying taxes by thousands of dollars or cheating on one's spouse, might occur more commonly, but are likely to be undetectable to researchers (and perhaps unethical to even try to detect).

The most feasible measures are laboratory measures, such as misreporting the roll of a die to an experimenter in order to win a greater payout. But it's unclear what the relationship is between laboratory behaviors for minor stakes and overall moral behavior in the real world.

Individual behaviors can be difficult to interpret.

Another advantage that self-report and to some extent informant report have over direct behavioral measures is that there's an opportunity for contextual information to clarify the moral value or disvalue of behaviors: The morality of donating $10 or the immorality of not returning a library book might depend substantially on one's motives or financial situation, which self-report or informant report can potentially account for but which would be invisible in a simple behavioral measure. (Of course, on the flip side, this flexibility of interpretation is part of what permits bias to creep in.)

[a polygraph from 1937]

Physiological moralometers

A physiological moralometer would attempt to measure someone's morality by measuring something biological like their brain activity under certain conditions or their genetics. Given the current state of technology, no such moralometer is likely to arise soon. The best known candidate might be the polygraph or lie detector test, which is notoriously unreliable and of course doesn't purport to be a general measure of honesty much less of overall moral character.

Any genetic measure would of course omit any environmental influences on morality. Given the likelihood that environmental influences play a major role in people's moral development, no genetic measure could have a high correlation with a person's overall morality.
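The ceiling can be made explicit under a simple additive model, which is my assumption here and surely too simple: suppose overall morality is the sum of a genetic part and an independent environmental part. Then even a perfect gene-only measure correlates with the trait at no more than the square root of the genetic share of the variance. The function name and the example shares below are hypothetical, not estimates of anything.

```python
import math


def max_genetic_correlation(genetic_variance_share):
    """Upper bound on the correlation between a gene-only measure and a trait.

    Assumes trait = genetic part + independent environmental part, where
    genetic_variance_share is the fraction of trait variance due to genes.
    """
    if not 0.0 <= genetic_variance_share <= 1.0:
        raise ValueError("genetic_variance_share must be between 0 and 1")
    return math.sqrt(genetic_variance_share)


# Purely illustrative shares of variance attributed to genes.
for share in (0.09, 0.25):
    print(share, round(max_genetic_correlation(share), 2))
```

The bound is reached only by a measure that captures the genetic contribution perfectly. Any realistic genetic measure would fall below it.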

Brain measures, being potentially closer to measuring the mental states that underlie morality, don't have a similar ceiling accuracy, but currently look less promising than behavioral measures, informant report measures, and probably even self-report measures.

The Inaccuracy of All Methods

It thus seems likely that there is no good method for accurately measuring a person's overall moral character. Self-report, informant report, behavioral measures, and physiological measures all face large methodological difficulties. If a moralometer is something that accurately measures an individual person's morality, like a thermometer accurately (accurately enough) measures a person's body temperature, there's little reason to think we could build one.

It doesn't follow that we can't imprecisely measure someone's moral character. It's reasonable to expect the existence of small correlations between some potential measures and a person's real underlying overall moral character. And maybe such measures could be used to look for trends aggregated across groups.

Now, this whole post has been premised on the idea that it makes sense to talk of a person's overall morality as something that could be captured, at least in principle, by a number such as 0 to 100 or -1 to +1. There are a few reasons to doubt this, including moral relativism and moral incommensurability -- but more on that in a future post.