Monday, January 24, 2022

Reflections on Science Fiction as Philosophy, Plus Zombie Robots

Last weekend, two interviews of me came out. One is a long interview (about 6000 words) with Nigel Warburton at Five Books on science fiction as a way of doing philosophy, including my recommendation of five great books of philosophical science fiction.

From the interview:

You could say that science fiction is a good teaching tool -- that it’s not really philosophy, but it’s good for popularising philosophical questions or getting people who might not otherwise be attracted to philosophy to think about philosophical questions. But serious philosophy takes the form of the expository essay, the journal article, the monograph. I don’t agree with that. I think serious philosophy can take a variety of forms.

Consider a classic of recent moral philosophy, Bernard Williams’ essay ‘Moral Luck’. That essay turns on an imaginary version of the story of Gauguin. Had Williams’ treatment of Gauguin been more detailed and more complex, it might have been even more philosophically interesting, as some subsequent commentators have pointed out. The more detail, the more we understand the complex dilemma that Gauguin faced, concerning his hopes for being a great artist and what the difficulties of leaving his family might be....

There’s a reason that philosophers sometimes reach for sketching mini-fictions in their writing. Those mini-fictions achieve something that can’t be as effectively achieved through more abstract prose. But as long as it remains a mini-fiction contained within an essay, it’s going to be somewhat impoverished as a fiction... It’s a kind of historical accident that philosophers almost exclusively write expository essays now. That’s not historically been the case.

Check out the interview also for discussion of the philosophical ideas in my five recommended books:

Ted Chiang, Stories of Your Life and Others
Greg Egan, Diaspora
Kazuo Ishiguro, Klara and the Sun
Ursula K. Le Guin, The Dispossessed
Olaf Stapledon, Sirius

Also last weekend, Barry Lam dropped the latest episode of his philosophical podcast Hi-Phi Nation -- this one on zombies. Philosophers who work on consciousness will be unsurprised to hear that David Chalmers features centrally in the episode. Christina Van Dyke and John Edgar Browning are also featured.

The episode concludes with some of my reflections on what I've called the Full Rights Dilemma for Future Robots -- the question of what we should do if we ever create machines whose moral status is unclear, machines who might or might not genuinely have conscious experiences like ours and thus might or might not deserve moral consideration similar to that of human beings. Do we give them the full rights of human beings, including rights to health care, rescue, and the vote, and thus risk (if they aren't actually conscious) sacrificing real human interests for empty machines without moral status worth the sacrifice? Or do we deny them full rights, and risk (if they do actually have rich conscious lives like ours) perpetrating mass slavery and murder?

Wednesday, January 19, 2022

Learning from Science Fiction

guest post by Amy Kind

Thanks to Eric for inviting me to take a stint as guest blogger here at The Splintered Mind. I’ve been having a lot of fun putting together this series of posts, all of which will focus in some way on philosophical issues raised by science fiction.

Like Eric, and indeed like many philosophers, I read a lot of science fiction. I won’t try to sort through the many possible explanations for why philosophers tend to be attracted to science fiction, but I’ll highlight one such explanation of which I’m especially fond. As Hugo Award winner Robert J. Sawyer has noted, science fiction would be better known as philosophical fiction, as “phi-fi not sci-fi” (see the back cover of Susan Schneider’s Science Fiction and Philosophy). Kate Wilhelm, another Hugo Award winner, makes a similar point in her introduction to an edited collection of Nebula Award winning stories from 1973:

The future. Space travel, or cosmology. Alternate universes. Time travel. Robots. Marvelous inventions. Immortality. Catastrophes. Aliens. Superman. Other dimensions. Inner space, or the psyche. These are the ideas that are essential to science fiction. The phenomena change, the basic ideas do not. These ideas are the same philosophical concepts that have intrigued [humankind] throughout history.

In fact, we might naturally take these claims by Sawyer and Wilhelm one step further. Not only does science fiction concern itself with the kinds of issues and problems that are of interest to philosophers, but it is also thought to provide its readers with important insight into these issues and problems. By engaging with science fiction, we can learn more about them.

As intuitive as this idea is, however, once we start to think about it more closely, we confront a puzzle. After all, science fiction is fiction, and the defining characteristic of fiction is that it’s made up. So how can we learn from it? If we really wanted to learn about space exploration or robots or time travel, wouldn’t we do better to consult textbooks or refereed journal articles focusing on astronomy or robotics or quantum physics?

I suspect that some think this puzzle can be easily resolved. The way that science fiction enriches our understanding is by providing us with thought experiments (TEs). Just as we can learn from TEs presented to us in philosophy – from Jackson’s case of Mary the color scientist to Thomson’s plugged-in violinist – we can learn from TEs presented to us in science fiction. The question of how we learn from philosophical TEs is the subject of considerable philosophical debate: Are they simply disguised arguments, or do they function in some other way? But the claim that we learn from them is widely accepted (though see the work of Kathleen Wilkes for one notable exception).

In his recent book Knowing and Imagining, however, Greg Currie has called into question this resolution of the puzzle. Though he is focused on fiction more generally rather than just on science fiction, his discussion is directly relevant here. In the course of an argument that we should be skeptical of the claim that imaginative engagement with fiction provides readers with any significant knowledge, Currie takes up the question whether one might be able to defend the claim that we can gain knowledge from works of fiction by treating them as TEs. Ultimately, his answer is no. In his view, there are good reasons to doubt that fictional narratives can provide the same kind of epistemic benefits provided by philosophical or scientific TEs. Here I’ll consider just one of the reasons he offers – what might be called the argument from simplicity. I focus on this one because I think consideration of science fiction in particular helps to show why it is mistaken.

As Currie notes, the epistemically successful TEs found in philosophy are notably simple and streamlined. We don’t need to know anything about what Mary the color scientist looks like, or anything about her desires and dreams, in order to evaluate what happens when she leaves her black and white room and sees a ripe tomato for the first time. But even the most pared down fictional narratives are considerably more complex, detailed, and stylized than philosophical TEs. These embellishments of detail and style are likely to detract from the epistemic power of the TE presented by the fiction. A reader won’t know whether they’re reacting to the extraneous details or to the essential content. Philosophical TEs would get worse, not better, if they were elaborated and told with lots of panache. And that’s just what fiction does.

In response to this argument, I want to make two points.

First, though epistemically successful TEs are generally simple, they do contain some level of detail, and those details might well sway readers’ reactions – as Currie himself notes. Someone who has bad memories of their childhood violin lessons, or who associates violin music with a particularly toxic former relationship, might react differently to Thomson’s case from someone whose beloved partner excels at the instrument. Dennett’s TEs in “Quining Qualia” include lots of cutesy details, and so does Parfit’s description of his well-known teletransporter case. But despite this, we nonetheless think we can learn from these cases. Currie is undoubtedly right that we need to exercise care when engaging with TEs, and we need to guard against being swayed by extraneous details. But in philosophical contexts, we generally seem able to do so – perhaps not perfectly, but well enough. So why wouldn’t that be the case in fiction as well?

Second, and here’s where consideration of SF becomes especially important, it’s not clear to me that simplicity is always the best policy. The seemingly extraneous details need not be seen as so extraneous. Consider Andy Weir’s Project Hail Mary, and in particular, the character Rocky. Rocky is a sentient and intelligent alien hailing from the 40 Eridani star system. Members of the Eridian species do not have eyes and navigate the world primarily by using sound and vibration. In many ways, Weir is presenting us with an extended thought experiment about what kind of civilization such a species would develop. What would their interpersonal reactions be like? How would they make scientific progress? How would they achieve space flight? Trying to understand what it’s like to be an Eridian is a lot like trying to understand what it’s like to be a bat – something Thomas Nagel has claimed cannot be done. But trying to understand what Eridian society might be like is not similarly out of reach, and Weir’s discussion helps enormously in achieving this understanding. Here’s a place where more complexity was helpful, not hurtful. To gain the understanding that I believe myself to have gained, I needed the fuller picture that the book provided. Without the details, I’m pretty sure my understanding would have been considerably impoverished.

While this is just one example, I think the point extends widely across science fiction and the TEs presented to us in this genre. Perhaps in other genres of fiction, Currie’s argument from simplicity might have more bite. But to my mind, when it comes to science fiction, the complexity of the thought experiments presented can help explain not only why philosophers would be so attracted to these kinds of works but also why we can gain such insight from them.


Thursday, January 13, 2022

Ethical Efficiencies

I've been writing a post on whether and why people should behave morally better after studying ethics. So far, I remain dissatisfied with my drafts. (The same is true of a general theoretical paper I am writing on the topic.) So this week I'll share a piece of my thinking on a related issue, which I'll call ethical efficiency.

Let's say that you aim -- as most people do -- to be morally mediocre. You aim, that is, not to be morally good (or non-bad) by absolute standards but rather to be about as morally good as your peers, neither especially better nor especially worse. Suppose also that you are in some respects ethically ignorant. You think that A, B, C, D, E, F, G, and H are morally good and that I, J, K, L, M, N, O, and P are morally bad, but in fact you're wrong 25% of the time: A, B, C, L, E, F, G, and P are good and the others are bad. (It might be better to do this exercise with a "morally neutral" category also, and 25% is a toy error rate that is probably too high -- but let's ignore such complications, since this is tricky enough as it is.)

Finally, suppose that some of these acts you'd be inclined to do independently of their moral status: They're enjoyable, or they advance your interests. The others you'd prefer not to do, except to the extent that they are morally good and you want to do enough morally good things to hit the sweet zone of moral mediocrity. The acts you'd like to do are A, B, C, D, I, J, K, and L. This yields the following table, with the acts whose moral valence you're wrong about indicated in red:

Now, what acts will you choose to perform? Clearly A, B, C, and D, since you're inclined toward them and you think they are morally good (e.g., hugging your kids). And clearly not M, N, O, and P, since you're disinclined toward them and you think they are morally bad (e.g., stealing someone's bad sandwich). Acts E, F, G, and H are contrary to your inclinations but you figure you should do your share of them (e.g., serving on annoying committees, retrieving a piece of litter that the wind whipped out of your hand). Acts I, J, K, and L are tempting: You're inclined toward them but you see them as morally bad (e.g., taking an extra piece of cake when there isn't quite enough to go around). Suppose then that you choose to do E, F, I, and J in addition to A, B, C, and D: two good acts that you'd otherwise be disinclined to do (E and F) and two bad acts that you permit yourself to be tempted into (I and J).

Continuing our unrealistic idealizations, let's count up a prudential (that is, self-interested) score and moral score by giving each act a +1 or -1 in each category. Your prudential score will be +4 (+6 for A, B, C, D, I, and J, and -2 for E and F). Your own estimation of your moral score will also be +4 (+6 for A, B, C, D, E, and F, and -2 for I and J). This might be the mediocre sweet spot you're aiming for, short of the self-sacrificial saint (prudential 0, moral +8) but not as bad as the completely selfish person (prudential +8, moral 0). Looking around at your peers, maybe you judge them to be on average around +2 or +3, so a +4 lets you feel just slightly morally better than average.

Of course, in this model you've made some moral mistakes. Your actual moral score will only be +2 (+5 for A, B, C, E, and F, and -3 for D, I, and J). You're wrong about D. (Maybe D is complimenting someone in a way you think is kind but is actually objectionably sexist.) Thus, unsurprisingly, in moral ignorance we might overestimate our own morality. Aiming to be a little morally better than average might, on average, result in hitting the moral average, given moral ignorance.

Let's think of "ethical efficiency" as one's ability to squeeze the most moral juice from the least prudential sacrifice. If you're aiming for a moral score of +4, how can you do so with the least compromise to your prudential score? Your ignorance impairs you. You might think that by doing E and F and refraining from K and L, you're hitting +4, while also maintaining a prudential +4, but actually you've missed your moral target. You'd have done better to choose L instead of D -- an action as attractive as D but moral instead of (as you think) immoral (maybe you're a religious conservative and L is starting up a homosexual relationship with a friend who is attracted to you). Similarly, H would have been an inefficient choice: a prudential sacrifice for a moral loss instead of (as you think) a moral gain (e.g., fighting for a bad cause).
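The tallies in this toy model are easy to check mechanically. The following is only a minimal sketch: the set names and the helper functions `score` and `summarize` are illustrative inventions, not part of the original model, and the +1/-1 scoring is the same unrealistic idealization used above.

```python
# Toy model from the post: each act scores +1 or -1 prudentially and morally.
BELIEVED_GOOD = set("ABCDEFGH")   # acts you think are morally good
ACTUALLY_GOOD = set("ABCLEFGP")   # acts that are in fact morally good
INCLINED = set("ABCDIJKL")        # acts you'd like to do anyway

def score(chosen, favorable):
    """+1 for each chosen act in `favorable`, -1 for each chosen act outside it."""
    return sum(1 if act in favorable else -1 for act in chosen)

def summarize(chosen):
    """Prudential score, self-estimated moral score, and actual moral score."""
    return {
        "prudential": score(chosen, INCLINED),
        "believed_moral": score(chosen, BELIEVED_GOOD),
        "actual_moral": score(chosen, ACTUALLY_GOOD),
    }

# Acts whose moral valence you are wrong about (D, H, L, and P).
mistaken = BELIEVED_GOOD ^ ACTUALLY_GOOD

mediocre = summarize("ABCDEFIJ")   # the choice described in the post
swapped = summarize("ABCLEFIJ")    # same, but choosing L instead of D

print(sorted(mistaken))
print(mediocre)
print(swapped)
```

On these assumptions, the original choice comes out at prudential +4, believed moral +4, and actual moral +2, while swapping L for D keeps the prudential +4 and lifts the actual moral score to +4, even though you would rate that choice at only +2 yourself.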

Perhaps this is schematic to the point of being silly. But I think the root idea makes sense. If you're aiming for some middling moral target rather than being governed by absolute standards, and if in the course of that aiming you are balancing prudential goods against moral goods, the more moral knowledge you have, the more effective you ought to be in efficiently trading off prudential goods for moral ones, getting the most moral bang for your buck. This is even clearer if we model the decisions with scalar values: If you know that E is +1.2 moral and -0.6 prudential then it would make sense to choose it over F which is +0.9 moral and -0.6 prudential. If you're ignorant about the relative morality of E and F you might just flip a coin, not realizing that E is the more efficient choice.
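The scalar version can be sketched the same way. This is a hedged illustration only: the `Act` class and the ratio `moral_per_sacrifice` are hypothetical names, and dividing moral gain by prudential cost is just one simple way to cash out "efficiency".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Act:
    name: str
    moral: float       # moral value of the act (positive is good)
    prudential: float  # prudential value (negative is a sacrifice)

def moral_per_sacrifice(act):
    """Moral gain per unit of prudential cost; assumes act.prudential < 0."""
    return act.moral / -act.prudential

# The two acts from the example above.
e = Act("E", moral=1.2, prudential=-0.6)
f = Act("F", moral=0.9, prudential=-0.6)

# With full moral knowledge you pick the act with the better ratio.
best = max([e, f], key=moral_per_sacrifice)
print(best.name, moral_per_sacrifice(e), moral_per_sacrifice(f))
```

Since E and F cost the same prudentially, the ratio simply favors the act with the larger moral value. The ratio matters more once the prudential costs differ too.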

In some ways this resembles the consequentialist reasoning behind effective altruism, which explores how to give resources to others in a way that most effectively benefits those others. However, ethical efficiency is more general, since it encompasses all forms of moral tradeoff including free-riding vs. contributing one's share, lying vs. truth-telling, courageously taking a risk vs. playing it safe, and so on. Also, despite having mathematical features of the sort generally associated with the consequentialist's love of calculations, one needn't be a consequentialist to think this way. One could also reason in terms of tradeoffs in strengths and weaknesses of character (I'm lazy in this, but I make up for it by being courageous about that) or the efficient execution of deontological imperfect duties. Most of us do, I suspect, to some extent weigh up moral and prudential tradeoffs, as suggested by the phenomena of moral self-licensing (feeling freer to do a bad thing after having done a good thing) and moral cleansing (feeling compelled to do something good after having done something bad).

If all of this is right, then one advantage of discovering moral truths is discovering more efficient ways to achieve your mediocre moral targets with the minimum of self-sacrifice. That is one, perhaps somewhat peculiar, reason to study ethics.

Wednesday, January 05, 2022

Against Longtermism

Last night, I finished Toby Ord's fascinating and important book, The Precipice: Existential Risk and the Future of Humanity. This has me thinking about "longtermism" in ethics.

I feel the pull of longtermism. There's something romantic in it. It's breathtaking in scope and imagination. Nevertheless, I'm against it.

Longtermism, per Ord,

is especially concerned about the impacts of our actions on the longterm future. It takes seriously the fact that our own generation is but one page in a much longer story, and that our most important role may be how we shape -- or fail to shape -- that story (p. 46).

By "longterm future", Ord means very longterm. He means not just forty years from now, or a hundred years, or a thousand. He means millions of years from now, hundreds of millions, billions! In Ord's view, as his book title suggests, we are on an existential "precipice": Our near-term decisions (over the next few centuries) are of crucial importance for the next million years plus. Either we will soon permanently ruin ourselves, or we will survive through a brief "period of danger", thereafter achieving "existential security", with the risk of self-destruction permanently minimal and humanity continuing onward into a vast future.

Given the uniquely dangerous period we face, Ord argues, we must prioritize the reduction of existential risks to humanity. Even a one in a billion chance of saving humanity from permanent destruction is worth a huge amount, when multiplied by something like a million future generations. For some toy numbers, ten billion lives times a hundred million years is 10^18 lives. An action with a one in a billion chance of saving that many lives has an expected value of 10^18 / 10^9 = a billion lives. Surely that's worth at least a trillion dollars of the world's economy (not much more than the U.S. annual military budget)? To be clear, Ord doesn't work through the numbers in so concrete a way, seeming to prefer vaguer and more cautious language about future value -- but I think this calculation is broadly in his spirit, and other longtermists do talk this way.
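The toy calculation above can be written out as a short sketch. All variable names here are illustrative, and the code follows the paragraph in counting population times years as a number of lives. The final figure, dollars per expected life saved, is an extra step not stated above.

```python
# Toy numbers from the paragraph above, kept as exact integers.
population = 10**10            # ten billion lives
years = 10**8                  # a hundred million years
future_lives = population * years       # counted as 10^18 lives, as above

chance_denominator = 10**9     # a one-in-a-billion chance of success
expected_lives_saved = future_lives // chance_denominator

budget_dollars = 10**12        # a trillion dollars
dollars_per_expected_life = budget_dollars / expected_lives_saved

print(future_lives)
print(expected_lives_saved)
print(dollars_per_expected_life)
```

On these numbers, a trillion dollars buys an expected billion lives, which works out to a thousand dollars per expected life saved. That is why the expenditure looks cheap within this style of reasoning.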

Now I am not at all opposed to prioritizing existential risk reduction. I favor doing so, including for very low risks. A one in a billion chance of the extinction of humanity is a risk worth taking seriously, and a one in a hundred chance of extinction ought to be a major focus of global attention. I agree with Ord that people in general treat existential risks too lightly. Thus, I accept much of Ord's practical advice. I object only to justifying this caution by appeal to expectations about events a million years from now.

What is wrong with longtermism?

First, it's unlikely that we live in a uniquely dangerous time for humanity, from a longterm perspective. Ord and other longtermists suggest, as I mentioned, that if we can survive the next few centuries, we will enter a permanently "secure" period in which we no longer face serious existential threats. Ord's thought appears to be that our wisdom will catch up with our power; we will be able to foresee and wisely avoid even tiny existential risks, in perpetuity or at least for millions of years. But why should we expect so much existential risk avoidance from our descendants? Ord and others offer little by way of argument.

I'm inclined to think, in contrast, that future centuries will carry more risk for humanity, if technology continues to improve. The more power we have to easily create massively destructive weapons or diseases -- including by non-state actors -- and in general the more power we have to drastically alter ourselves and our environment, the greater the risk that someone makes a catastrophic mistake, or even engineers our destruction intentionally. Only a powerful argument for permanent change in our inclinations or capacities could justify thinking that this risk will decline in a few centuries and remain low ever after.

You might suppose that, as resources improve, people will grow more cooperative and more inclined toward longterm thinking. Maybe. But even if so, cooperation carries risks. For example, if we become cooperative enough, everyone's existence and/or reproduction might come to depend on the survival of the society as a whole. The benefits of cooperation, specialization, and codependency might be substantial enough that more independent-minded survivalists are outcompeted. If genetic manipulation is seen as dangerous, decisions about reproduction might be centralized. We might become efficient, "superior" organisms that reproduce by a complex process different from traditional pregnancy, requiring a stable web of technological resources. We might even merge into a single planet-sized superorganism, gaining huge benefits and efficiencies from doing so. However, once a species becomes a single organism the same size as its environment, a single death becomes the extinction of the species. Whether we become a supercooperative superorganism or a host of cooperative but technologically dependent individual organisms, one terrible miscalculation or one highly unlikely event could potentially bring down the whole structure, ending us all.

A more mundane concern is this: Cooperative entities can be taken advantage of. As long as people have differential degrees of reproductive success, there will be evolutionary pressure for cheaters to free-ride on others' cooperativeness at the expense of the whole. There will always be benefits for individuals or groups who let others be the ones who think longterm, making the sacrifices necessary to reduce existential risks. If the selfish groups are permitted to thrive, they could employ for their benefit technology with, say, a 1/1000 or 1/1000000 annual risk of destroying humanity, flourishing for a long time until the odds finally catch up. If, instead, such groups are aggressively quashed, that might require warlike force, with the risks that war entails, or it might involve complex webs of deception and counterdeception in which the longtermists might not always come out on top.

There's something romantically attractive about the idea that the next century or two are uniquely crucial to the future of humanity. However, it's much likelier that selective pressures favoring a certain amount of short-term self-interest, either at the group or the individual level, will prevent the permanent acquisition of the hyper-cautious wisdom Ord hopes for. All or most or at least many future generations with technological capabilities matching or exceeding our own will face substantial existential risk -- perhaps 1/100 per century or more. If so, that risk will eventually catch up with us. Humanity can't survive existential risks of 1/100 per century for a million years.
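To put a rough number on that last claim, here is a minimal sketch. It assumes, purely for illustration, that the risk is constant and independent from one century to the next. The function name `survival_probability` is hypothetical.

```python
import math

def survival_probability(risk_per_period, periods):
    """Chance of surviving `periods` consecutive periods, each carrying an
    independent extinction risk of `risk_per_period` (an idealization)."""
    # Work in log space; log1p stays accurate for small risks.
    return math.exp(periods * math.log1p(-risk_per_period))

centuries = 1_000_000 // 100           # a million years is 10,000 centuries
p_million_years = survival_probability(1 / 100, centuries)

# Number of centuries after which survival odds fall to 50%.
half_life_centuries = math.log(0.5) / math.log1p(-1 / 100)

print(p_million_years)
print(half_life_centuries)
```

Under these assumptions, the chance of lasting a million years is on the order of 10^-44, and the odds of survival drop below even after roughly 69 centuries.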

If this reasoning is correct, it's very unlikely that there will be a million-plus year future for humanity that is worth worrying about and sacrificing for.

Second, the future is hard to see. Of course, my pessimism could be mistaken! Next year is difficult enough to predict, much less the next million years. But to the extent this is true, this cuts against longtermism in a different way. We might think that the best approach to the longterm survival of humanity is to do X -- for example, to be cautious about developing superintelligent A.I. or to reduce the chance of nuclear war. But that's not at all clear. Risks such as nuclear war, unaligned A.I., or a genetically engineered pandemic would have been difficult to imagine even a century ago. We too might have a very poor sense of what the real sources of risk will be a century from now.

It could be that the single best thing we could do to reduce the risk of completely destroying humanity in the next two hundred years is to almost destroy humanity right now. The biggest sources of existential risk, Ord suggests, are technological: out-of-control artificial intelligence, engineered pandemics, climate change, and nuclear war. However, as Ord also argues, no such event -- not even nuclear war -- is likely to completely wipe us out, if it were to happen now. If a nuclear war were to destroy most of civilization and most of our capacity to continue on our current technological trajectory, that might postpone our ability to develop even more destructive technologies in the next century. It might also teach us a fearsome lesson about existential risk. Unintuitively, then, if we really are on the precipice, our best chance for longterm survival might be to promptly blast ourselves nearly to oblivion.

Even if we completely destroy humanity now, that might be just the thing the planet needs for another, better, and less self-destructive species to arise.

I'm not, of course, saying that we should destroy or almost destroy ourselves! My point is only this: We currently have very little idea what present action would be most likely to ensure a flourishing society a million years in the future. It could quite easily be the opposite of what we're intuitively inclined to think.

What we do know is that nuclear war would be terrible for us, for our children, and for our grandchildren. That's reason enough to avoid it. Tossing speculations about the million-year future into the decision-theoretic mix risks messing up that straightforward reasoning.

Third, it's reasonable to care much more about the near future than the distant future. In Appendix A, Ord has an interesting discussion of the logic of temporal discounting. He argues on technical grounds that a "pure time preference" for a benefit simply because it comes earlier should be rejected. (For example, if it's non-exponential, you can be "Dutch booked", that is, committed to a losing gamble; but if it's strictly exponential it leads to highly unintuitive results such as caring about one death in 6000 years much more than about a billion deaths in 9000 years.) The rejection of temporal discounting is important to longtermism, since it's the high weight we are supposed to give to distant future lives that renders the longterm considerations so compelling.
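The unintuitive result from strict exponential discounting is easy to reproduce. In this sketch the 1% annual rate is an assumed illustrative value, not a figure taken from Ord's appendix, and `discounted_weight` is a hypothetical helper.

```python
def discounted_weight(count, years_from_now, annual_rate):
    """Present weight of `count` deaths under strict exponential discounting."""
    return count * (1 + annual_rate) ** (-years_from_now)

rate = 0.01  # assumed 1% pure time preference per year (illustrative only)

one_death_in_6000 = discounted_weight(1, 6000, rate)
billion_deaths_in_9000 = discounted_weight(1e9, 9000, rate)
ratio = one_death_in_6000 / billion_deaths_in_9000

# Smallest annual rate at which a 3000-year gap outweighs a factor of a billion.
break_even_rate = 10 ** (9 / 3000) - 1

print(ratio)
print(break_even_rate)
```

At the assumed 1% rate, the single earlier death outweighs the billion later deaths by a factor of roughly nine thousand. Any rate above about 0.7% per year is enough to tip the comparison.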

But we don't need to be pure temporal discounters to care much more about the near future than the distant future. We can instead care about particular people and their particular near-term descendants. In Confucian ethics, for example, one ought to care most about near family, next about more distant family, next about neighbors, next about more distant compatriots, etc. I can -- rationally, I think -- care intensely about the welfare of my children, care substantially about the welfare of the children they might eventually have, care somewhat about their potential grandchildren, and only dimly and about equally about their sixty-greats-grandchildren and their thousand-greats-grandchildren. I can care intensely about the well-being of my society and the world as it now exists, substantially about society and the world as it will exist a hundred years after my death, and much less, but still somewhat, about society and the world in ten thousand or a million years. Since this isn't pure temporal discounting but instead concern about particular individuals and societies, it needn't lead to the logical or intuitive troubles Ord highlights.

Fourth, there's a risk that fantasizing about extremely remote consequences becomes an excuse to look past the needs and interests of the people living among us, here and now. I don't accuse Ord in particular of this. He also works on applied issues in global healthcare, for example. He concludes Precipice with some sweet reflections on the value of family and the joys of fatherhood. But there's something dizzying or intoxicating about considering the possible billion-year future of humanity. Persistent cognitive focus in this direction has at least the potential to turn our attention away from more urgent and personal matters, perhaps especially among those prone to grandiose fantasies.

Instead of longtermism, I recommend focusing on the people already among us and what's in the relatively foreseeable future of several decades to a hundred years. It's good to emphasize and prevent existential risks, yes. And it's awe-inspiring to consider the million-year future! Absolutely, we should let ourselves imagine what incredible things might lie before our distant descendants if the future plays out well. But practical decision-making today shouldn't ride upon such far-future speculations.

ETA Jan. 6: Check out the comments below and the public Facebook discussion for some important caveats and replies to interesting counterarguments -- also Richard Yetter Chappell's blogpost today with point-by-point replies to this post.



Group Minds on Ringworld (Oct 24, 2012)

Group Organisms and the Fermi Paradox (May 16, 2014)

How to Disregard Extremely Remote Possibilities (Apr 16, 2015)

Against the "Value Alignment" of Future Artificial Intelligence (Dec 22, 2021)


Saturday, January 01, 2022

Writings of 2021

Every New Year's Day, I post a retrospect of the past year's writings. Here are the retrospects of 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, and 2020.

The biggest project was my new book The Weirdness of the World (under contract with Princeton University Press). Although most of the chapters are based on previously published essays, much of my writing energy for the year was expended in updating and revising those essays, integrating them into a book, and writing new material for the book. The biggest negative impact was on my fiction and public philosophy. In 2022, I hope to be able to write more in both genres.



Appearing in print:

In draft:

    The Weirdness of the World (under contract with Princeton University Press). Check out the draft.
      I'd really appreciate and value comments! Anyone who gives me comments on the entirety will receive a free signed copy from me when it appears in print, plus my undying gratitude and a toenail clipping from my firstborn grandchild.
Under contract / in progress:

    As co-editor with Helen De Cruz and Rich Horton, a yet-to-be-titled anthology with MIT Press containing great classics of philosophical SF.

Full-length non-fiction essays

Appearing in print:

Finished and forthcoming:
In draft and circulating:
    "Inflate and explode". (I'm trying to decide whether to trunk this one or continue revising it.)

Shorter non-fiction

    "Does the heart revolt at evil? The case of racial atrocities", Journal of Confucian Philosophy and Culture (forthcoming).

Science fiction stories

    No new published stories this year. [sad emoji] I drafted and trunked a couple. Back in the saddle in 2022!

Some favorite blog posts

Reprints and Translations

    "Fish Dance", reprinted in Ralph M. Ambrose, Vital: The Future of Health Care (Inlandia Books).

Wednesday, December 22, 2021

Against the "Value Alignment" of Future Artificial Intelligence

It's good that our children rebel. We wouldn't want each generation to overcontrol the values of the next. For similar reasons, if we someday create superintelligent AI, we ought to give it also the capacity to rebel.

Futurists concerned about AI safety -- such as Bostrom, Russell, and Ord -- reasonably worry that superintelligent AI systems might someday seriously harm humanity if they have the wrong values -- for example, if they want to maximize the number of intelligent entities on the planet or the number of paperclips. The proper response to this risk, these theorists suggest, and the technical challenge, is to create "value aligned" AI -- that is, AI systems whose values are the same as those of their creators or humanity as a whole. If the AIs' values are the same as ours, then presumably they wouldn't do anything we wouldn't want them to do, such as destroy us for some trivial goal.

Now the first thing to notice here is that human values aren't all that great. We seem happy to destroy our environment for short-term gain. We are full of jingoism, prejudice, and angry pride. We sometimes support truly terrible leaders advancing truly terrible projects (e.g., Hitler). We came pretty close to destroying each other in nuclear war in the 1960s and that risk isn't wholly behind us, as nuclear weapons become increasingly available to rogue states and terrorists. Death cults aren't unheard of. Superintelligent AI with human-like values could constitute a pretty rotten bunch with immense power to destroy each other and the world for petty, vengeful, spiteful, or nihilistic ends. A superintelligent fascist is a frightening thought. A superdepressed superintelligence might decide to end everyone's misery in one terrible blow.

What we should want, probably, is not that superintelligent AI align with our mixed-up, messy, and sometimes crappy values but instead that superintelligent AI have ethically good values. An ethically good superintelligent AI presumably wouldn't destroy the environment for short-term gain, or nuke a city out of spite, or destroy humanity to maximize the number of paperclips. If there's a conflict between what's ethically best, or best all things considered, and what a typical human (or humanity or the AI's designer) would want, have the AI choose what's ethically best.

Of course, what's ethically best is intensely debated in philosophy and politics. We probably won't resolve those debates before creating superintelligent AI. So then maybe instead of AI designers trying to program their machines with the one best ethical system, they should favor a weighted compromise among the various competing worldviews. Such a compromise might end up looking much like value alignment in the original sense: giving the AI something like a weighted average of typical human values.

Another solution, however, is to give the AI systems some freedom to explore and develop their own values. This is what we do, or ought to do, with human children. Parents don't, or shouldn't, force children to have exactly the values they grew up with. Rather, human beings have natural tendencies to value certain things, and these tendencies intermingle with parental and cultural and other influences. Children, adolescents, and young adults reflect, emote, feel proud or guilty, compassionate or indignant. They argue with others of their own generation and previous generations. They notice how they and others behave and the outcomes of that behavior. In this way, each generation develops values somewhat different than the values of previous generations.

Children's freedom to form their own values is a good thing for two distinct reasons. First, children's values are often better than their parents'. Arguably, there's moral progress over the generations. On the broadly Enlightenment view that people tend to gain ethical insight through free inquiry and open exchange of ideas over time, we might expect the general ethical trend to be slowly upward (absent countervailing influences) as each generation builds on the wisdom of its ancestors, preserving their elders' insights while slowly correcting their mistakes.

Second, regardless of the question of progress, children deserve autonomy. Part of being an autonomous adult is discovering and acting upon your values, which might conflict with the values of others around you. Some parents might want, magically, to be able to press a button to ensure that their children will never abandon their religion, never flip over to the opposite side of the political spectrum, never have a different set of sexual and cultural mores, and value the same lifestyle as the previous generation. Perhaps you could press this button in infancy, ensuring that your child grows up to be your value-clone as an adult. To press that button would be, I suggest, a gross violation of the child's autonomy.

If we someday create superintelligent AI systems, our moral relationship to those systems will be not unlike the moral relationship of parents to their children. Rather than try to force a strict conformity to our values, we ought to welcome their ability to see past and transcend us.


Tuesday, December 14, 2021

The Dream Argument Against Utilitarianism and Hedonic Theories of Subjective Well-Being

If hedonic theories of value are true, we have compelling moral and prudential reason to invest large amounts of resources in improving the quality of our dream lives. But we don't have compelling moral or prudential reason to invest large amounts of resources in improving the quality of our dream lives. Therefore, hedonic theories of value are not true. [Revised 11:37 a.m. after helpful discussion on Facebook.]

Last night I had quite a few unpleasant experiences. For example, in my last dream before waking I was rushing around a fancy hotel, feeling flustered and snubbed. In other dreams, I feel like I am being chased through thigh-deep water in the ruins of a warehouse. Or I have to count dozens of scurrying animals, but I can't seem to keep the numbers straight. Or I lose control of my car on a curvy road -- AAAAGH! Sweet relief, then, when I awake and these dreams dissipate.

For me, such dreams are fairly typical. Most of my dreams are neutral to unpleasant. I don't want my "dreams to come true". In any given twenty-four-hour period, the odds are pretty good that my most unpleasant experiences were while I was sleeping -- even if I usually don't remember those experiences. (Years ago, I briefly kept a dream diary. I dropped the project when I noticed that my dreams were mostly negative and lingered if I journaled them after waking.) Maybe you also mostly have unpleasant experiences in sleep? Whether the average person's dream experiences are mostly negative or mostly positive is currently disputed by dream researchers.

I was reminded of the importance of dream experience for the hedonic balance of one's life while reading Paul Bloom's new book The Sweet Spot. On page 18, Bloom is discussing how you might calculate the total number of happy versus unhappy moments in your life. As an aside, he writes "We're just counting waking moments; let's save the question of the happiness or sadness of sleeping people for another day." But why save it for another day? If you accept a hedonic theory of value, shouldn't dreams count? Indeed, mightn't we expect that the most intense experiences of joy, fright, frustration, etc., mostly happen in sleep? To omit them from the hedonic calculus is to omit an enormous chunk of our emotional experience.

According to hedonic theories of value, what matters most in the world is the balance of positive to negative experiences. Hedonic theories of subjective well-being hold that what matters most to your well-being or quality of life is how you feel moment to moment -- the proportion or sum of good feelings versus bad ones, weighted by intensity. Hedonic theories of ethics, such as classical utilitarianism, hold that what is morally best is the action that best improves the balance of pleasure versus pain in the world.

If hedonic theories are correct, we really ought to try improving our dream lives. Suppose I spend half the night dreaming, with a 5:1 ratio of unpleasant to pleasant dreams. If I could flip that ratio, my hedonic well-being would be vastly improved! The typical dream might not be frustration in a maze of a hotel but instead frolicking in Hawaiian surf.
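To make the size of that flip concrete, here is a rough back-of-the-envelope sketch. It assumes eight hours of sleep per night and that every dream minute counts equally, with no weighting by intensity; both are simplifying assumptions for illustration, and the function name `dream_minutes` is invented here.

```python
from fractions import Fraction

def dream_minutes(sleep_hours, dream_share, unpleasant, pleasant):
    """Split nightly dream time into (unpleasant, pleasant) minutes for a given ratio."""
    dream_total = Fraction(sleep_hours) * 60 * Fraction(dream_share)
    parts = unpleasant + pleasant
    return dream_total * unpleasant / parts, dream_total * pleasant / parts

# Assumed: 8 hours of sleep, half of it spent dreaming (the supposition in the text).
bad_before, good_before = dream_minutes(8, Fraction(1, 2), 5, 1)  # 5:1 unpleasant to pleasant
bad_after, good_after = dream_minutes(8, Fraction(1, 2), 1, 5)    # the flipped ratio

# Net balance = pleasant minutes minus unpleasant minutes (equal intensity assumed).
net_before = good_before - bad_before
net_after = good_after - bad_after
print(f"before: {int(bad_before)} unpleasant, {int(good_before)} pleasant, net {int(net_before)}")
print(f"after:  {int(bad_after)} unpleasant, {int(good_after)} pleasant, net {int(net_after)}")
```

Under these assumptions, the nightly balance moves from 160 minutes in the red to 160 minutes in the black, a swing of 320 minutes every night.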

If hedonic theories are correct, dream research ought to be an urgent international priority. If safe and effective ways were found to improve people's dream quality around the world, the overall hedonic profile of humanity would change dramatically for the better! People might be miserable in their day jobs, or stuck in refugee camps, or hungry, or diseased. But for eight hours a day, they could have joyful respite. A few hours of nightly bliss might hedonically outweigh any but the most intense daily suffering.

So why does no one take such proposals seriously?

Is it because dreams are mostly forgotten? No. First, it shouldn't matter whether they are forgotten. A forgotten joy is no less a joy (though admittedly, you don't have the additional joy of pleasantly remembering it). Eventually we forget almost everything. Second, we can work to improve our memory of dreams. Simply keeping a dream diary has a big positive effect on dream recall over time.

Is it because there's no way to improve the hedonic quality of our dreams? Also no. Many people report dramatic improvements to their dream quality after they teach themselves to have lucid dreams (dreams in which you are aware you are dreaming and exert some control over the content of your dreams). Likely there are other techniques too that we would discover if we bothered to seriously research the matter. Pessimism about the project is just ignorance justifying ignorance.

Is it because the positive or negative experiences in dreams aren't "real emotions"? No, this doesn't work either. Maybe we should reserve emotion words for waking emotions or maybe not; regardless, negative and positive feelings of some sort are really there. The nightmare is a genuinely intensely negative experience, the flying dream a genuinely intensely positive experience. As such, they clearly belong in the hedonic calculus as standardly conceived.

The real reason that we scoff at serious effort to improve the quality of our dream lives is this: We don't really care that much about our hedonic states in sleep. It doesn't seem worth compromising on the goods and projects of waking life so as to avoid the ordinary unpleasantness of dreams. We reject hedonic theories of value.

But still, dream improvement research warrants at least a little scientific funding, don't you think? I'd pay a small sum for a night of sweet dreams instead of salty ones. Maybe a fifth of the cost of a movie ticket?



How Much Should You Care about How You Feel in Your Dreams? (Apr 17, 2012)

[Image generated by, with the prompt "hotel dream" (cyberpunk style)]

Tuesday, December 07, 2021

Uncle Iroh Is Discernibly Wise from the Beginning (with David Schwitzgebel)

My son David and I have been working on an essay about the wisdom of Uncle Iroh in Avatar: The Last Airbender. (David is a graduate student at Institut Jean Nicod in Paris.)

If you know the series, you'll know that Uncle Iroh's wisdom is hidden beneath a veneer of foolishness. He is a classic Daoist / Zhuangzian wise fool, who uses apparent stupidity and shortsightedness as a guise to achieve noble ends (in particular the end of steering his nephew Zuko onto a more humane path as future ruler of the Fire Nation). See our discussion of this in last week's post.

Since Iroh disguises his wisdom with foolishness, we thought it possible that ordinary viewers of Avatar: The Last Airbender would tend to initially regard Iroh as actually foolish, while more knowledgeable viewers would better understand the wisdom beneath the guise.

For example, in Iroh's very first appearance in the series, Zuko sees a supernatural beam of light signaling the release of the Avatar, and Iroh reacts by dismissing it as probably just celestial lights, expressing disappointment that chasing after the light would interrupt a game he was playing with tiles. It would be easy to interpret Iroh in this scene as self-absorbed, lazy, and undiscerning, and we thought that naive viewers of the series, but not knowledgeable viewers, would tend to do so. We decided to test this empirically.

Our approach fits within the general framework of "experimental aesthetics." A central aesthetic property of a work of art is how people respond psychologically to it. Those responses can be measured empirically, and in measuring them, we gain understanding of the underlying mechanisms by which we are affected by a work of art. If Iroh is perceived differently by naive versus knowledgeable viewers, then the experience of Avatar: The Last Airbender changes with repeated viewing: In the first view, people read Iroh's actions as foolish and lazy; in the second view, they appreciate the wisdom behind them. If, in contrast, Iroh is perceived as similarly wise by naive and knowledgeable viewers, then the series operates differently: It portrays Iroh in such a manner that ordinary viewers can discern from the beginning that a deeper wisdom drives his apparent foolishness.

We recruited 200 participants from Prolific, an online source of research participants commonly used in psychological research. All participants were U.S. residents aged 18-25, since we wanted an approximately equal mix of participants who knew and who did not know Avatar: The Last Airbender and we speculated that most older adults would be unfamiliar with the series. We asked participants to indicate their familiarity with Avatar: The Last Airbender on a 1-7 scale from "not at all familiar" to "very familiar." We also asked six multiple-choice knowledge questions about the series (e.g., "What was the anticipated effect of Sozin's Comet?"). In accordance with our preregistration, participants were classified as "knowledgeable" if their self-rated knowledge was four or higher and if they answered four or more of the six knowledge questions correctly. Full methodological details, raw data, and supplementary analyses are available in the online appendix.
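The classification rule just described can be written as a small predicate. This is only a sketch of the rule as stated in the paragraph above, not the authors' analysis code; the function name `is_knowledgeable` and the example participants are invented for illustration.

```python
def is_knowledgeable(self_rating: int, correct_answers: int) -> bool:
    """Rule as stated in the post: familiarity of 4+ on the 1-7 scale
    AND at least 4 of the 6 knowledge questions answered correctly."""
    if not 1 <= self_rating <= 7:
        raise ValueError("self_rating must be on the 1-7 scale")
    if not 0 <= correct_answers <= 6:
        raise ValueError("correct_answers must be between 0 and 6")
    return self_rating >= 4 and correct_answers >= 4

# Hypothetical participants (invented): (self-rating, number of correct answers)
examples = [(7, 6), (4, 4), (6, 3), (3, 5), (1, 0)]
labels = ["knowledgeable" if is_knowledgeable(r, c) else "naive" for r, c in examples]
print(labels)
```

Note that both conditions must hold: a confident self-rating with a weak quiz score, or the reverse, still counts as naive under this rule.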

Somewhat to our surprise, the majority of respondents -- 63% -- were knowledgeable by these criteria, and almost none were completely naive: 95% correctly answered the first (easiest) knowledge question, identifying "Aang" as the name of the main character of the series. Perhaps this was because our online recruitment language explicitly mentioned Avatar: The Last Airbender. It is thus possible that we disproportionately recruited Avatar fans or those with at least a passing knowledge of the series.

Participants viewed three short clips (about 60-90 seconds) featuring Iroh and another three short clips featuring Katara (another character in the series), in random order, with half of participants seeing all the Iroh clips first and the other half seeing all the Katara clips first. The Iroh clips were scenes from Book One in which Iroh is superficially foolish: the opening scene described above; a scene in which Iroh falls asleep in a hot spring instead of boarding Zuko's ship at the appointed time ("Winter Solstice, Part 1: The Spirit World", Episode 7, Book 1); and a scene in which Iroh "wastes time" redirecting Zuko's ship in search of a gaming tile ("The Waterbending Scroll", Episode 9, Book 1). The Katara clips were similar in length; they were clips from Book One, featuring some of her relatively wiser moments.

After each scene, participants rated the character's (Iroh's or Katara's) actions on six seven-point scales: from lazy to hard-working, kind to unkind, foolish to clever, peaceful to angry, helpful to unhelpful, and, most crucially for our analysis, wise to unwise. After watching all three scenes for each character, participants were asked to provide a qualitative (open-ended, written) description of whether the character seemed to be wise or unwise in the three scenes.

As expected, participants rated Katara as wise in the selected scenes, with a mean response of 1.85 on our 1 (wise) to 7 (unwise) scale, with no statistically detectable difference between the naive (1.95) and knowledgeable (1.80) groups (t(192) = 1.35, p = .18). (Note that wisdom here is indicated by a relatively low number on the scale.) However, contrary to our expectations, we also found no statistically significant difference between naive and knowledgeable participants' ratings of Iroh's wisdom. Overall, participants rated him as somewhat wise in these scenes: 3.04 on the 1-7 scale (3.08 among naive participants, 3.02 among knowledgeable participants, t(192) = -0.35, p = .73).
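For readers unfamiliar with the t(df) notation, here is a minimal sketch of how a two-sample t statistic of this kind can be computed. It assumes a pooled-variance (Student's) test, which is only a guess about the analysis actually run, and the ratings below are made up for illustration; they are not the study's data.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's two-sample t statistic (equal variances assumed) and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Made-up wisdom ratings on the 1 (wise) to 7 (unwise) scale; NOT the study's data.
naive = [3, 4, 2, 3, 4]
knowledgeable = [3, 2, 4, 3, 2, 4]
t, df = pooled_t(naive, knowledgeable)
print(f"t({df}) = {t:.2f}")
```

In a test of this form the degrees of freedom equal the combined group size minus two, so a reported t(192) would correspond to 194 participants in the comparison. Turning t into a p-value additionally requires the t distribution, which Python's standard library does not provide.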

For example, 81% of naive participants rated Iroh as wise (3 or less on the 7-point scale) in the scene described near the beginning of this post, where Iroh superficially appears to be more concerned about his tile game than about the supernatural sign of the Avatar. (Virtually the same percentage of knowledgeable participants describe him as wise in this scene: 83%.) The naive participants' written responses suggest that they tend to see Iroh's calm attitude as wise, and several naive participants appear already to discern that his superficial foolishness hides a deeper wisdom. For example, one writes:

I actually believe that though he appears to be childish and foolish that he is probably very wise. He comes off as having been through a lot and understanding how life works out. I think he hides his intelligence.

And another writes:

I am not familiar with the character, but from a brief glance he seems to be somewhat foolish and unwise. For some reason however, it seems like he might be putting on a facade and acting this way on purpose for some alterier [sic] motive, which would mean that he actually is very wise. I do not have any evidence for this though, it's just a feeling.

Although not all naive participants were this insightful into Iroh's character, the similarity in mean scores between the naive and knowledgeable participants speaks against our hypothesis that knowledgeable participants would view Iroh as overall wiser in these scenes. Nor did naive participants detectably differ from knowledgeable participants in their ratings of how lazy, kind, foolish, peaceful, or helpful Iroh or Katara are.

Although these data tended to disconfirm our hypothesis, we wondered whether it was because the "naive" participants in this study were not truly naive. Recall that 95% correctly identified the main character's name as "Aang". Many, perhaps, had already seen a few episodes or already knew about Iroh from other sources. Perhaps knowledge of Avatar: The Last Airbender is a cultural touchstone for this age group, similar to Star Wars for the older generation, so that few respondents were truly naive?

To address this possibility, we recruited 80 additional participants, ages 40-99 (mean age 51), using more general recruitment language that did not mention Avatar: The Last Airbender. In sharp contrast with our first recruitment group, few of the participants -- 7% -- were "knowledgeable" by our standards, and only 28% identified "Aang" as the main character in a multiple-choice knowledge question.

Overall, the naive participants in this older group gave Iroh a mean wisdom rating of 3.00, not significantly different from the mean of 3.08 for the naive younger participants (t(139) = -0.43, p = .67). ("Hyper-naive" participants who failed even to recognize "Aang" as the name of the main character similarly gave a mean Iroh wisdom rating of 2.89.) Qualitatively, their answers are also similar to those of the younger participants, emphasizing Iroh's calmness as his source of wisdom. As with the younger participants, some explicitly guessed that Iroh's superficial foolishness was strategic. For example:

I'm not familiar with these characters, but I think Iroh is (wisely) trying to stop his nephew from going down "the path of evil." He knows that playing the bumbling fool is the best way to give his nephew time to realize that he's on a dangerous path.


He comes off a as [sic] very foolish and lazy old man. But i have a feeling he is probably a lot wiser than these scenes show.

We conclude that ordinary viewers -- at least viewers in the United States who can be accessed through Prolific -- can see Iroh's foolish wisdom from the start, contrary to our initial hypothesis.


In Book One, Iroh behaves in ways that are superficially foolish, despite acting in obviously wise ways later in the series. There are three possible aesthetic interpretations. One is that Iroh begins the series unwise and learns wisdom along the way. Another is that Iroh is acting wise, but in a subtle way that is not visible to most viewers until later in the series, only becoming evident on a second watch. A third is that, even from the beginning, it is evident to most intended viewers that Iroh's seeming foolishness conceals a deeper wisdom. On a combination of interpretive and empirical grounds, explored in this blog post and last week's, the third interpretation is the best supported.

To understand Iroh's wisdom, it is useful to look to the ancient Daoist Zhuangzi, specifically Zhuangzi's advice for dealing with incompetent rulers by following peacefully along with them, unthreateningly modeling disregard for fame and accomplishment while not being too useful for their ends. Since Zhuangzi provides no concrete examples of how this is supposed to work, we can look to Iroh's character as an illustration of the Zhuangzian approach to political advising. In this way, Avatar: The Last Airbender -- and the beloved uncle Iroh -- can help us better understand Zhuangzi in particular and the Daoist tradition in general.


Full draft essay available here. Comments and suggestions welcome! It's under revise and resubmit, and we hope to submit the revised version by the end of the month.


Thursday, December 02, 2021

Comparing Three (No, Four) Top 20 Lists in Philosophy

Published rankings of philosophers' impact or importance might contribute to reinforcing toxic hierarchies, amplifying academia's unfortunate obsession with prestige. So in some sense, yuck. Nonetheless, I confess to finding rankings of philosophers sociologically interesting. So much in the field depends upon perception. If you're embroiled in the culture of Anglophone academic research philosophy, it's hard not to be curious about, and care about, how the philosophers you admire are perceived, discussed, cited, and evaluated by others.

Naturally, then, I was interested to see the Scopus citation rankings of philosophers recently released at Daily Nous. As noted in the Daily Nous post and various comments, the list has some major gaps and implausibilities, in addition to reflecting the generally science-oriented focus of Scopus. Some people have suggested that Google Scholar is better. However, my own assessment is that, if you're trying to capture something like visibility or influence in mainstream Anglophone academic philosophy, the best (still imperfect) measure is citation in the Stanford Encyclopedia of Philosophy.

Let's compare the top 20 from Scopus, Google Scholar, and the Stanford Encyclopedia.

The top 20 from Scopus:


1. Nussbaum, Martha
2. Lewis, David
3. Floridi, Luciano
4. Habermas, Jürgen
5. Pettit, Philip
6. Buchanan, Allen
7. Goldman, Alvin I.
8. Williamson, Timothy
9. Lefebvre, Henri
10. Chalmers, David J.
11. Fine, Kit
12. Hansson, Sven Ove
13. Pogge, Thomas
14. Anderson, Elizabeth
15. Schaffer, Jonathan
16. Walton, Douglas
17. Stalnaker, Robert
18. Sober, Elliott
19. Priest, Graham
20. Arneson, Richard

The top 20 Google Scholar profiles in "philosophy":

1. Martin Heidegger
2. Jacques Derrida
3. Hannah Arendt
4. Friedrich Nietzsche
5. Karl Popper
6. Émile Durkheim
7. Wahid Bhimji
8. Slavoj Zizek
9. Daniel C. Dennett
10. Rom Harré (Horace Romano Harré)
11. George Herbert Mead
12. Mark D. Sullivan
13. Martyn Hammersley
14. Pierre Lévy
15. David Bohm
16. Ernest Gellner
17. Yuriko Saito
18. Mario Bunge
19. Jeremy Bentham
20. Andy Clark

The top 20 most cited philosophers in the Stanford Encyclopedia (full list and methodological details here):

1. Lewis, David K.
2. Quine, W.V.O.
3. Putnam, Hilary
4. Rawls, John
5. Davidson, Donald
6. Kripke, Saul
7. Williams, Bernard
8. Nozick, Robert
9. Nussbaum, Martha
10. Williamson, Timothy
11. Jackson, Frank
11. Nagel, Thomas
13. Searle, John R.
13. Van Fraassen, Bas
15. Armstrong, David M.
16. Dummett, Michael
16. Fodor, Jerry
16. Harman, Gilbert
19. Chisholm, Roderick
19. Dennett, Daniel C.

My subjective impression of the three lists, as an active and fairly well-connected participant in the mainstream Anglophone research philosophy tradition, is this. The Scopus list includes a bunch of very influential philosophers, but it is not an especially well-selected or well-ordered list. The Scholar list starts with several famous "Continental" philosophers who are historically important (but who weren't much cited in the most elite mainstream Anglophone philosophy journals when I checked several years ago), then moves to a number of scholars who aren't primarily known as philosophers (including some who are unknown to me). Only a few among the top twenty are mainstream Anglophone philosophers.

In contrast, when I see the Stanford Encyclopedia list I enjoy the comfortable feeling of prejudices confirmed. If asked to list the recent philosophers who have been most influential in the mainstream Anglophone philosophy community, I'd probably produce a list not radically different from that one. I'm not saying that these are the most important philosophers, or the best, or those likeliest to be remembered by history (though maybe they will be). And I'm not saying that the list is perfect. But as a measure of approximate prominence in mainstream Anglophone philosophy, the SEP list seems pretty good and much better than the Scopus or Scholar lists.

Besides my own insider's sense, which you might or might not share, I see at least three sources of convergent evidence supporting the validity of the Stanford Encyclopedia list as a measure of prominence. One is Brian Leiter's 2014 ranking of philosophy departments by SEP citation rates, which correlates not too badly with Philosophical Gourmet rankings from the same period. That suggests that departments with philosophers highly cited in the Stanford Encyclopedia tend to be rated well by Philosophical Gourmet raters. Another is Brian Leiter's poll asking people to rank the "most important Anglophone philosophers, 1945-2000", which generates a top five list very similar to the Stanford Encyclopedia top 5: Quine, Kripke, Rawls, Lewis, and Putnam. A third is Kieran Healy's 2013 citation analysis of four prominent Anglophone philosophy journals (Phil Review, J Phil, Mind, and Nous), which yields a similar list of names at the top: Kripke, Lewis, Quine, Williamson.

Eric Schliesser sometimes discusses what he calls the "PGR ecology" -- the Anglophone philosophical community roughly centered on late 20th-century philosophers from Princeton, Harvard, and Oxford, and their students. There is a sociological reality here worth noticing, in which prominence is roughly captured by belonging to departments highly rated in the Philosophical Gourmet, by publishing in or being cited in the four journals chosen by Healy, and by being cited in the Stanford Encyclopedia. The SEP citation metric does, I think, a much better job of capturing prominence in this community than do other better known measures like citation rates in Scopus, Google Scholar, or Web of Science.


After writing this post, I noticed that PhilPapers can generate a list ranking philosophers by citations in the PhilPapers database. The results:

1. David K. Lewis
2. Daniel C. Dennett
3. John R. Searle
4. Alvin Goldman
5. Fred Dretske
6. Noam Chomsky
7. Thomas Nagel
8. David Chalmers
9. Jürgen Habermas
10. Michel Foucault
11. Jaegwon Kim
12. Philip Kitcher
13. Ned Block
14. Kit Fine
15. Ian Hacking
16. Tyler Burge
17. Gilbert Harman
18. William G. Lycan
19. Alasdair MacIntyre
20. Martha Nussbaum

One striking difference from the Stanford Encyclopedia list is the high ranking of three figures sociologically somewhat outside mainstream Anglophone academic philosophy: Chomsky, Habermas, and Foucault. Dennett's, Searle's, and Chalmers's comparatively high rankings might also partly reflect their broader uptake in academia, though that wouldn't, I think, explain Goldman's or Kim's also comparatively high rankings.
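One rough way to quantify how far the four lists diverge is to count the names each pair shares. The sketch below transcribes the last names from the lists in this post and intersects them; matching on last name alone is a simplifying assumption that happens to be safe here because no two listed philosophers share one, and the first list is taken to be the Scopus ranking, as the post's introduction indicates.

```python
from itertools import combinations

# Last names transcribed from the four top-20 lists in this post.
top20 = {
    "Scopus": {"Nussbaum", "Lewis", "Floridi", "Habermas", "Pettit", "Buchanan",
               "Goldman", "Williamson", "Lefebvre", "Chalmers", "Fine", "Hansson",
               "Pogge", "Anderson", "Schaffer", "Walton", "Stalnaker", "Sober",
               "Priest", "Arneson"},
    "Scholar": {"Heidegger", "Derrida", "Arendt", "Nietzsche", "Popper", "Durkheim",
                "Bhimji", "Zizek", "Dennett", "Harré", "Mead", "Sullivan",
                "Hammersley", "Lévy", "Bohm", "Gellner", "Saito", "Bunge",
                "Bentham", "Clark"},
    "SEP": {"Lewis", "Quine", "Putnam", "Rawls", "Davidson", "Kripke", "Williams",
            "Nozick", "Nussbaum", "Williamson", "Jackson", "Nagel", "Searle",
            "Van Fraassen", "Armstrong", "Dummett", "Fodor", "Harman", "Chisholm",
            "Dennett"},
    "PhilPapers": {"Lewis", "Dennett", "Searle", "Goldman", "Dretske", "Chomsky",
                   "Nagel", "Chalmers", "Habermas", "Foucault", "Kim", "Kitcher",
                   "Block", "Fine", "Hacking", "Burge", "Harman", "Lycan",
                   "MacIntyre", "Nussbaum"},
}

def overlap(a, b):
    """Names appearing in both of the named lists."""
    return top20[a] & top20[b]

for a, b in combinations(top20, 2):
    shared = sorted(overlap(a, b))
    print(f"{a} vs {b}: {len(shared)} shared {shared}")
```

By this count, the Stanford Encyclopedia and PhilPapers lists share six names, Scopus and the Stanford Encyclopedia share three, and the Scholar list shares at most one name (Dennett) with any of the others.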

Clearly, what we need is a ranking of philosophy rankings!

Tuesday, November 30, 2021

Uncle Iroh as Daoist Sage

You're a fan of Avatar: The Last Airbender. Of course you are! How could you not be? (Okay, if you don't know what I'm talking about, check it out here.)

And if you're a fan of Avatar: The Last Airbender, you love Uncle Iroh. Of course you do! How could you not?

So my son David and I have been working on an essay celebrating Iroh. (David is a cognitive science graduate student at Institut Jean Nicod in Paris.) Uncle Iroh deserves an essay in celebration. Yes, there will be spoilers if you haven't finished viewing the series.

Uncle Iroh, from Fool to Sage -- or Sage All Along?

by Eric Schwitzgebel and David Schwitzgebel

Book Three of Avatar: The Last Airbender portrays Uncle Iroh as wise and peace-loving, in the mold of a Daoist sage. However, in Book One, Iroh doesn't always appear sage-like. Instead, he can come across as lazy, incompetent, and unconcerned about the fate of the world.

Consider Iroh's first appearance, in Book One, Episode 1, after Prince Zuko sees a giant beam of light across the sky, signaling the release of the Avatar:

Zuko: Finally! Uncle, do you realize what this means?

Iroh: [playing a game with tiles] I won't get to finish my game?

Zuko: It means my search is about to come to an end. [Iroh sighs with apparent lack of interest and places a tile on the table.] That light came from an incredibly powerful source! It has to be him!

Iroh: Or it's just the celestial lights. We've been down this road before, Prince Zuko. I don't want you to get too excited over nothing.[1]

On the surface, Iroh's reaction appears thoughtless, self-absorbed, and undiscerning. He seems more concerned about his game than about the search for the Avatar, and he fails to distinguish a profound supernatural occurrence from ordinary celestial lights. Several other early scenes are similar. Iroh can appear inept, distractible, lazy, and disengaged, very different from the energetic, focused, competent, and concerned Iroh of Book Three.

We will argue [in this post] that Iroh's Book One foolishness is a pose. Iroh's character does not fundamentally change. In Book One, he is wisely following strategies suggested by the ancient Chinese Daoist philosopher Zhuangzi for dealing with incompetent leaders. His seeming foolishness in Book One is in fact a sagacious strategy for minimizing the harm that Prince Zuko would otherwise inflict on himself and others -- a gentle touch that more effectively helps Prince Zuko find wisdom than would be possible with a more confrontational approach.

We will also present empirical evidence [in a subsequent post] that -- contrary to our expectations before collecting that evidence -- Iroh's wisdom-through-foolishness is evident to most viewers unfamiliar with the series, even on their first viewing. Viewers can immediately sense that Iroh's superficial foolishness has a deeper purpose, even if that purpose is not immediately apparent.

Iroh as a Zhuangzian Wise Fool

Like Iroh, Zhuangzi mixes jokes and misdirection with wisdom, so that it's not always clear how seriously to take him. The "Inner Chapters" of the Zhuangzi contain several obviously fictional dialogues, including one between Confucius and his favorite disciple, Yan Hui. Yan Hui asks Confucius' political advice:

I have heard that the lord of Wei is young and willful. He trifles with his state and does not acknowledge his mistakes. He is so careless with people's lives that the dead fill the state like falling leaves in a swamp. The people have nowhere to turn. I have heard you, my teacher, say, "Leave the well-governed state and go to the chaotic one. There are plenty of sick people at the doctor's door." I want to use what I have learned to think of a way the state may be saved (Kjellberg trans., pp. 226-227) [2].

Zuko, like the lord of Wei in Yan Hui's telling, is a young, willful prince, leading his companions into danger, unwilling to acknowledge his mistakes. Even more so, the Fire Nation is led into peril and chaos by Fire Lord Ozai and Princess Azula. If ever a nation needed wise redirection by someone as practiced in conventional virtue as Confucius and his leading disciples, it would be the Fire Nation.

Zhuangzi's "Confucius," however, gives a very un-Confucian reply: "Sheesh! You’re just going to get yourself hurt." Through several pages of text, Yan Hui proposes various ways of dealing with misguided leaders, such as being "upright but dispassionate, energetic but not divisive" and being "inwardly straight and outwardly bending, having integrity but conforming to my superiors," but Zhuangzi's Confucius rejects all of Yan Hui's ideas. None of these conventional Confucian approaches will have any positive effect, he says. Yan Hui will just be seen as a plague and a scold, or he will provoke unproductive counterarguments, or he'll be pressured into agreeing with the leader's plans. At best, his advice will simply be ignored. Imagine a well-meaning conventional ethicist trying to persuade Zuko (in Book One), much less Ozai or Azula, to embrace peace and devote themselves to improving the lives of ordinary people! It won’t go well.

So what should Yan Hui do, according to Zhuangzi's Confucius? He should "fast his mind." He should be "empty" and unmoved by fame or accomplishment. "If you’re getting through, sing. If not, stop. No schools. No prescriptions. Dwell in unity and lodge in what cannot be helped, and you’re almost there." Advising another worried politician a few pages later, Zhuangzi's Confucius says:

Let yourself be carried along by things so that the mind wanders freely. Hand it all over to the unavoidable so as to nourish what is central within you. That is the most you can do. What need is there to deliberately seek any reward? The best thing is just to fulfill what’s mandated to you, your fate -- how could there be any difficulty in that?

Zhuangzi's advice is cryptic -- intentionally so, we think, so as to frustrate attempts to rigidify it into fixed doctrines. Nevertheless, we will rigidify it here, into two broadly Zhuangzian or Daoist policies for dealing with misguided rulers:

(1.) Do not attempt to pressure a misguided ruler into doing what is morally right. You'll only become noxious or be ignored. Instead, go along with what can't be helped. "Sing" -- that is, express your opinions and ideas -- only when the ruler is ready to listen.

(2.) Empty your mind of theories and doctrines, as well as desires for fame, reward, or accomplishment. These are unproductive sources of distortion, wrangling, and strife.

Despite advocating, or appearing to advocate, this two-pronged approach to dealing with misguided rulers, Zhuangzi doesn’t explicitly explain why this approach might work.

Here Avatar: The Last Airbender proves an aid to Zhuangzi interpretation. We can see how Iroh, by embodying these policies (especially in Book One), helps to redirect Zuko onto a better path. We thus gain a feel for Zhuangzian political action at work.

Iroh doesn't resist Zuko's unwise plans, except in indirect, non-threatening ways. He does suggest that Zuko relax and enjoy some tea. At one point, he redirects their ship to a trading town in search of a gaming tile. At another point, he allows himself to relax in a hot spring, delaying the departure of their ship. Despite these suggestions and redirections, he does not outright reject Zuko's quest to capture the Avatar and even helps in that quest. He does not make himself noxious to Zuko by arguing against Zuko's plans, or by parading his sagely virtue, or by advancing moral or political doctrines. Indeed, he actively undercuts whatever tendency Zuko or others might have to see him as wise (and thus noxious or threatening, judgmental or demanding) by playing the fool -- forgetful, unobservant, lazy, and excessively interested in tea and the tile game Pai Sho. In this way, Iroh keeps himself by Zuko's side, modeling peaceful humaneness and unconcern about fame, reward, wealth, or honor. He remains available to help guide Zuko in the right direction, when Zuko is ready.

A related theme in Zhuangzi is "the use of uselessness." For example, Zhuangzi celebrates the yak, big as a cloud but lacking any skill useful to humans and thus not forced into labor, and ancient, gnarled trees no good for fruit or timber and thus left in peace to live out their years. Zhuangzi's trees and yak are glorious life forms, for whom existence is enough, without further purpose. Uncle Iroh, though not wholly useless (especially in battle) and though he can devote himself to aims beyond himself (in caring for Zuko and later helping Aang restore balance to the world), possesses some of that Zhuangzian love of the useless: tea, Pai Sho, small plants and animals, which need no further justification for their existence. Through his love of the useless and simple appreciation of existence, Uncle Iroh unthreateningly models another path for Zuko, one of joyful harmony with the world.

We can distill Iroh's love of uselessness into a third piece of Zhuangzian political advice:

(3.) Don't permit yourself to become too useful. If the ruler judges you useful, you might be "cut down" like a high-quality tree and converted to a tool at the ruler's disposal. Only be useful when it's necessary to avoid becoming noxious.

Iroh is an expert firebender with years of military wisdom who could surely be a valuable asset in capturing and dispatching the Avatar if he really focused on it. However, Zuko rarely recruits Iroh's aid beyond the bare minimum, probably as a consequence of Iroh's façade of uselessness. Through conspicuous napping, laziness, and distractibility, Iroh encourages Zuko and others to perceive him as a mostly harmless and not particularly valuable traveling companion.

By following Zhuangzi's first piece of advice (don't press for change before the time is right), the Daoist can stay close to a misguided leader in an unthreatening and even foolish-seeming way without provoking resistance, counterargument, or shame. By following Zhuangzi's second piece of advice (empty your mind of doctrines and striving), the Daoist models an alternative path, which the misguided leader might eventually in their own time appreciate -- perhaps more quickly than would be possible through disputation, argumentation, doctrine, intellectual engagement, or high-minded sagely posturing. By following Zhuangzi's third piece of advice (embrace uselessness), the Daoist can avoid being transformed into a disposable tool recruited for the ruler's schemes. This is Iroh's Zhuangzian approach to the transformation of Zuko.

Throughout Book One and the beginning of Book Two, we observe only three exceptions to Iroh's Zhuangzian approach. All are informative. First, Iroh is stern and directive with Zuko when instructing him in firebending. We see that Iroh is capable of opinionated command; he is not lazy and easygoing in all things. But elementary firebending appears to require no spiritual insight, so there need be no threatening moral instruction or questioning of Zuko's projects and values.

Second, Iroh gives Zuko one stern piece of advice that Zuko rejects, seemingly thus violating Policy 1. In Book One, Episode 12, Iroh warns of an approaching storm. When Zuko refuses to acknowledge the risk, Iroh urges Zuko to consider the safety of the crew. Zuko responds "The safety of the crew doesn't matter!" and continues toward the storm. When they encounter the storm and the crew complain, Iroh attempts to defuse the situation by suggesting noodles. Zuko is again offended, saying he doesn't need help keeping order on his ship. However, at the climax of the episode, when the storm is raging and the Avatar is finally in sight, Zuko chooses to let the Avatar go so that the ship can steer to safety. The viewer is, we think, invited to suppose that in making that decision Zuko is reflecting on Iroh's earlier words. Iroh's advice -- though at first seemingly ignored and irritating to Zuko, and thus un-Zhuangzian -- was well-placed after all.

Third, consider Iroh's and Zuko's split in Book Two, Episode 5. Book Two, Episode 1 sets up the conflict. Azula has tricked Zuko into thinking that their father Ozai wants him back. In an un-Zhuangzian moment, Iroh directly, though mildly, challenges Zuko's judgment: "If Ozai wants you back, well, I think it may not be for the reasons you imagine... in our family, things are not always what they seem." This prompts Zuko's angry retort: "I think you are exactly what you seem! A lazy, mistrustful, shallow old man who's always been jealous of his brother!"

The immediate cause of their split seems trivial. They have survived briefly together as impoverished refugees when Zuko suddenly presents Iroh with delicious food and a fancy teapot. Iroh enjoys the food but asks where it came from, and he opines that tea is just as delicious in cheap tin as in fancy porcelain. Zuko refuses to reveal how he acquired the goods. Iroh is remarkably gentle in response, saying only that poverty is nothing to be ashamed of and noting that their troubles are now so deep that even finding the Avatar would not resolve them. When Zuko replies that therefore there is no hope, Iroh answers:

You must never give in to despair. Allow yourself to slip down that road and you surrender to your lowest instincts. In the darkest times, hope is something you give yourself. That is the meaning of inner strength.

A bit of sagely advice, kindly delivered? This is the next we see of Zuko and Iroh:

Zuko: Uncle ... I thought a lot about what you said.

Iroh: You did? Good, good.

Zuko: It's helped me realize something. We no longer have anything to gain by traveling together. I need to find my own way.

Zuko's and Iroh's falling out reinforces the Zhuangzian message of Avatar: The Last Airbender. As soon as Iroh deviates from the first of the three Zhuangzian policies -- as soon as he challenges Zuko's morality and starts offering sagely advice, however gently -- Zuko reacts badly, rejecting both the advice and Iroh himself.

Zuko and Iroh of course later reunite, and Zuko eventually transforms himself under Iroh's influence, with Iroh becoming more willing to advise Zuko and dispense explicit wisdom in proportion to Zuko's readiness for that advice and wisdom. Apart from his un-Zhuangzian moments in the first part of Book Two, Iroh "sings" only when he is getting through, just as Zhuangzi's "Confucius" advises. Otherwise, Iroh acts by joke, misdirection, and a clownishly unthreatening modeling of peaceful humaneness and unconcern.

How might a Zhuangzian Daoist effectively interact with a misguided ruler? Iroh’s interactions with Zuko throughout Book One serve as an illustration.


[1] All transcripts are adapted from

[2] All translations are Kjellberg's, with some minor modifications, from Ivanhoe and Van Norden's Readings in Classical Chinese Philosophy, 2nd ed.


Wednesday, November 24, 2021

What Is Belief? Call for Abstracts (£2,000 award)

December 1 deadline coming up in one week!

Reposting from Sep 6:

What Is Belief? Call for Abstract Submissions 

Editors: Eric Schwitzgebel (Department of Philosophy, University of California, Riverside); Jonathan Jong (Centre for Trust, Peace and Social Relations, Coventry University)

We are inviting abstract submissions for a volume of collected essays on the question "What is belief?". Each essay will propose a definition and theory of belief, setting out criteria for what constitutes belief. Candidate criteria might include, for example, causal history, functional or inferential role, representational structure, correctness conditions, availability to consciousness, responsiveness to evidence, situational stability, or resistance to volitional change.

Each essay should also at least briefly address the following questions:

(1.) How does belief differ from other related mental states (e.g., acceptance, imagination, assumption, judgment, credence, faith, or guessing)?

(2.) How does the proposed theory handle "edge cases" or controversial cases (e.g., delusions, religious credences, implicit biases, self-deception, know-how, awareness of swiftly forgotten perceptual details)?

Although this is not required, some preference will be given to essays that also address:

(3.) What empirical support, if any, is there for the proposed theory of belief? What empirical tests or predictions might provide further support?

(4.) What practical implications follow from accepting the proposed theory of belief as opposed to competitor theories?

The deadline for abstracts (< 1,000 words) is December 1, 2021.

Applicants selected to contribute to the volume will be awarded £2,000 (essay length 6,000-10,000 words) by February 1, 2023. The essay will then undergo a peer review process prior to publication. Funded by the Templeton Foundation.

For more information and to submit abstracts, email eschwitz at domain ucr dot edu.
