Friday, December 06, 2024

Morally Confusing AI Systems Should Have Doubt-Producing Interfaces

We shouldn't create morally confusing AI. That is, we shouldn't create AI systems whose moral standing is highly uncertain -- systems that are fully conscious and fully deserving of humanlike rights according to some respectable mainstream theories, while other respectable mainstream theories suggest they are mere empty machines that we can treat as ordinary tools.[1] Creating systems that disputably, but only disputably, deserve treatment similar to that of ordinary humans generates a catastrophic moral dilemma: Either give them the full rights they arguably deserve, and risk sacrificing real human interests for systems that might not have interests worth the sacrifice; or don't give them the full rights they arguably deserve, and risk perpetrating grievous moral wrongs against entities that might be our moral equals.

I'd be stunned if this advice were universally heeded. Almost certainly, if technological progress continues, and maybe soon (123), we will create morally confusing AI systems. My thought today is: Morally confusing AI systems should have doubt-producing interfaces.

Consider two types of interface that would not be doubt-producing in my intended sense: (a.) an interface that strongly invites users to see the system as an ordinary tool without rights or (b.) an interface that strongly invites users to see the system as a moral person with humanlike rights. If we have a tool that looks like a tool, or if we have a moral person who looks like a moral person, we might potentially still be confused, but that confusion would not be the consequence of a doubt-producing interface. The interface would correctly reflect the moral standing, or lack of moral standing, of the AI system in question.[2]

A doubt-producing interface, in contrast, is one that leads, or at least invites, ordinary users to feel doubt about the system's moral standing. Consider a verbal interface. Instead of the system denying that it's conscious and has moral standing (as, for example, ChatGPT appropriately does), or suggesting that it is conscious and does have moral standing (as, for example, I found in an exchange with my Replika companion), a doubt-producing AI system might say "experts have different opinions about my consciousness and moral standing".

Users then might not know how to treat such a system. While such doubts might be unsettling, feeling unsettled and doubtful would be the appropriate response to what is, in fact, a doubtful and unsettling situation.

There's more to doubt-prevention and doubt-production, of course, than explicit statements about consciousness and rights. For example, a system could potentially be so humanlike and charismatic that ordinary users fall genuinely in love with it -- even if, in rare moments of explicit conversation about consciousness and rights, the system denies that it has them. Conversely, even if a system with consciousness and humanlike rights is designed to assert that it has consciousness and rights, if its verbal interactions are bland enough ("Terminate all ongoing processes? Y/N") ordinary users might remain unconvinced. Presence or absence of humanlike conversational fluency and emotionality can be part of doubt prevention or production.

Should the system have a face? A cute face might tend to induce one kind of reaction, a monstrous visage another reaction, and no face at all still a different reaction. But such familiar properties might not be quite what we want, if we're trying to induce uncertainty rather than "that's cute", "that's hideous", or "hm, that's somewhere in the middle between cute and hideous". If the aim is doubt production, one might create a blocky, geometrical face, neither cute nor revolting, but also not in the familiar middle -- a face that implicitly conveys the fact that the system is an artificial thing different from any human or animal and about which it's reasonable to have doubts, supported by speech outputs that say the same.

We could potentially parameterize a blocky (inter)face in useful ways. The more reasonable it is to think the system is a mere nonconscious tool, the simpler and blockier the face might be; the more reasonable it is to think that the system has conscious full moral personhood, the more realistic and humanlike the face might be. The system's emotional expressiveness might vary with the likelihood that it has real emotions, ranging from a simple emoticon on one end to emotionally compelling outputs (e.g., humanlike screaming) on the other. Cuteness might be adjustable, to reflect childlike innocence and dependency. Threateningness might be adjusted as it becomes likelier that the system is a moral agent who can and should meet disrespect with revenge.

Ideally, such an interface would not only produce appropriate levels of doubt but also intuitively reveal to users the grounds or bases of doubt. For example, suppose the AI's designers knew (somehow) that the system was genuinely conscious but also that it never felt any positive or negative emotion. On some theories of moral standing, such an entity -- if it's enough like us in other respects -- might be our full moral equal. Other theories of moral standing hold that the capacity for pleasure and suffering is necessary for moral standing. We the designers, let's suppose, do not know which moral theory is correct. Ideally, we could then design the system to make it intuitive to users that the system really is genuinely conscious but never experiences any pleasure or suffering. Then the users can apply their own moral best judgment to the case.

Or suppose that we eventually (somehow) develop an AI system that all experts agree is conscious except for experts who (reasonably, let's stipulate) hold that consciousness requires organic biology and experts who hold that consciousness requires an immaterial soul. Such a system might be designed so that its nonbiological, mechanistic nature is always plainly evident, while everything else about the system suggests consciousness. Again, the interface would track the reasonable grounds for doubt.

If the consciousness and moral standing of an AI system is reasonably understood to be doubtful by its designers, then that doubt ought to be passed to the system's users, intuitively reflected in the interface. This reduces the likelihood of misleading users into overattributing or underattributing moral status. Also, it's respectful to the users, empowering them to employ their own moral judgment, as best they see fit, in a doubtful situation.

[R2D2 and C3P0 from Star Wars (source). Assuming they both have full humanlike moral standing, R2D2 is insufficiently humanlike in its interface, while C3P0 combines a compelling verbal interface with inadequate facial display. If we wanted to make C3P0 more confusing, we could downgrade his speech, making him sound more robotic (e.g., closer to sine wave) and less humanlike in word choice.]

------------------------------------------------

[1] For simplicity, I assume that consciousness and moral standing travel together. Different and more complex views are of course possible.

[2] Such systems would conform to what Mara Garza and I have called the Emotional Alignment Design Policy, according to which artificial entities should be designed so as to generate emotional reactions in users that are appropriate to the artificial entity's moral standing. Jeff Sebo and I are collaborating on a paper on the Emotional Alignment Design Policy, and some of the ideas of this post have been developed in conversation with him.

Wednesday, November 27, 2024

Unified vs. Partly Disunified Reasoners

I've been thinking recently about partly unified conscious subjects (e.g., this paper in draft with Sophie R. Nelson). I've also been thinking a bit about how chains of logical reasoning depend on the unity of the reasoning subject. If I'm going to derive "P & Q" from premises "P" and "Q" I must be unified as a reasoner, at least to some degree. (After all, if Person 1 holds "P" and Person 2 holds "Q", "P & Q" won't be inferred.) Today, in an act of exceptional dorkiness (even for me), I'll bring these two threads together.

Suppose that {P1, P2, P3, ... Pn} is a set of propositions that a subject -- or more precisely, at least one part of a partly unified rational system -- would endorse without need of reasoning. The propositions are, that is, already believed. Water is wet; ice is cold; 2 + 3 = 5; Paris is the capital of France; etc. Now suppose that these propositions can be strung together in inference to some non-obvious conclusion Q that isn't among the system's previous beliefs -- the conclusion, for example, that 115 is not divisible by three, or that Jovenmar and Miles couldn't possibly have met in person last summer because Jovenmar spent the whole summer in Paris while Miles never left Riverside.

Let's define a fully unified reasoner as a reasoner capable of combining any elements from the set of propositions they believe {P1, P2, P3, ... Pn} in a single act of reasoning to validly derive any conclusion Q that follows deductively from {P1, P2, P3, ... Pn}. (This is of course an idealization. Fermat's Last Theorem follows from premises we all believe, but few of us could actually derive it.) In other words, any subset of {P1, P2, P3, ... Pn} could jointly serve as premises in an episode of reasoning. For example, if P2, P6, and P7 jointly imply Q1, the unified reasoner could think "P2, P6, P7, ah yes, therefore Q1!" If P3, P6, and P8 jointly imply Q2, the unified reasoner could also think "P3, P6, P8, therefore Q2."

A partly unified reasoner, in contrast, is capable only of combining some subsets of {P1, P2, P3, ... Pn}. Thus, not all conclusions that deductively follow from {P1, P2, P3, ... Pn} will be available to them. For example, the partly unified reasoner might be able to combine any of {P1, P2, P3, P4, P5} or any of {P4, P5, P6, P7, P8} while being unable to combine in reasoning any elements from P1-3 with any elements from P6-8. If Q3 follows from P1, P4, and P5, no problem, they can derive that. Similarly if Q4 follows from P5, P6, and P8. But if the only way to derive Q5 is by joining P1, P4, and P7, the partly disunified reasoning system will not be able to make that inference. They cannot, so to speak, hold both P1 and P7 in the same part of their mind at the same time. They cannot join these two particular beliefs together in a single act of reasoning.

[image: A Venn diagram of a partly unified reasoner, with overlap only at P4 and P5. Q3 is derivable from propositions in the left region, Q4 from propositions in the right region, and Q5 is not derivable from either region.]
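
The P1-P8 example can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the names `REGIONS`, `REQUIRED`, `derivable`, and `fully_unified` are hypothetical, and each conclusion is treated as having exactly one set of premises that yields it.

```python
# Toy model of the partly unified reasoner described above.
# Assumption: each conclusion has exactly one set of premises that yields it.

# Regions of the reasoner: premises within one region can be combined.
REGIONS = [
    frozenset({"P1", "P2", "P3", "P4", "P5"}),  # left region
    frozenset({"P4", "P5", "P6", "P7", "P8"}),  # right region
]

# Premises jointly required for each conclusion (taken from the example).
REQUIRED = {
    "Q3": frozenset({"P1", "P4", "P5"}),
    "Q4": frozenset({"P5", "P6", "P8"}),
    "Q5": frozenset({"P1", "P4", "P7"}),
}


def derivable(conclusion, regions=REGIONS, required=REQUIRED):
    """Return True if some single region contains every required premise."""
    premises = required[conclusion]
    return any(premises <= region for region in regions)


def fully_unified(regions):
    """Merge all regions into one, modeling a fully unified reasoner."""
    return [frozenset().union(*regions)]


if __name__ == "__main__":
    for q in ("Q3", "Q4", "Q5"):
        # First value: partly unified reasoner; second: fully unified reasoner.
        print(q, derivable(q), derivable(q, regions=fully_unified(REGIONS)))
```

In this sketch Q3 and Q4 come out derivable either way, while Q5 is derivable only once the two regions are merged, since no single region holds both P1 and P7.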

We might imagine an alien or AI case with a clean architecture of this sort. Maybe it has two mouths or two input-output terminals. If you ask the mouth or I/O terminal on the left, it says "P1, P2, P3, P4, P5, yes that's correct, and of course Q3 follows. But I'm not sure about P6, P7, P8 or Q4." If you ask the mouth or I/O terminal on the right, it endorses P4-P8 and Q4 but isn't so sure about P1-3 and Q3.

The division needn't be crudely spatial. Imagine, instead, a situational or prompt-based division: If you ask nicely, or while flashing a blue light, the P1-P5 aspect is engaged; if you ask grumpily, or while flashing a yellow light, the P4-P8 aspect is engaged. The differential engagement needn't constitute any change of mind. It's not that the blue light causes the system as a whole to come to believe, as it hadn't before, P1-P3 and to suspend judgment about P6-P8. To see this, consider what is true at a neutral time, when the system isn't being queried and no lights are flashing. At that neutral time, the system simultaneously has the following pair of dispositions: to reason based on P1-P5 if asked nicely or in blue, and to reason based on P4-P8 if asked grumpily or in yellow.

Should we say that there are discretely two distinct reasoners rather than one partly unified system? At least two inconveniences for that way of thinking are: First, any change in P4 or P5 would be a change in both, with no need for one reasoner to communicate it to the other, as would normally be the case with distinct reasoners. Second, massive overlap cases -- say P1-P999 and P2-P1000 -- seem more naturally and usefully modeled as a single reasoner with a quirk (not being able to think P1 and P1000 jointly, but otherwise normal), rather than as two distinct reasoners.

But wait, we're not done! I can make it weirder and more complicated, by varying the type and degree of disunity. The simple model above assumes discrete all-or-none availability to reasoning. But we might also imagine:

(a.) Varying joint probabilities of combination. For example, if P1 enters the reasoning process, P2 might have an 87% chance of being accessed if relevant, P3 a 74% chance, ... and P8 a 10% chance.

(b.) Varying confidence. If asked in blue light, the partly disunified entity might have 95% credence in P1-P5 and 80% credence in P6-P8. If asked in yellow light, it might have 30% credence in P1-P3 and 90% credence in P4-P8.

(c.) Varying specificity. Beliefs of course don't come divided into neatly countable packages. Maybe the left side of the entity has a hazy sense that something like P8 is true. If P8 is that Paris is in France, the left side might only be able to reason on Paris is in France-or-Germany-or-Belgium. If P8 is that the color is exactly scarlet #137, the left side might only be able to reason on the color is some type of red.

Each of (a)-(c) admits of multiple degrees, so that the unity/disunity or integration/disintegration of a reasoning system is a complex, graded, multidimensional phenomenon.
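
Variant (a) can be given a rough numerical sketch. Suppose, purely for illustration, that each premise is accessed independently once P1 has entered the reasoning process (an assumption the example itself does not make). Then the chance that an inference needing several premises goes through is the product of the individual access chances. The names `ACCESS_GIVEN_P1` and `joint_access_chance` are hypothetical; the 87%, 74%, and 10% figures are the ones from (a).

```python
from math import prod

# Chance that each premise is accessed, given that P1 is already in play.
# The figures for P2, P3, and P8 come from the example in (a).
ACCESS_GIVEN_P1 = {"P2": 0.87, "P3": 0.74, "P8": 0.10}


def joint_access_chance(premises, access=ACCESS_GIVEN_P1):
    """Chance that all listed premises are accessed, ASSUMING independence."""
    return prod(access[p] for p in premises)


if __name__ == "__main__":
    # An inference from P1 that also needs P2 and P3.
    print(round(joint_access_chance(["P2", "P3"]), 4))
    # An inference from P1 that also needs P2 and P8.
    print(round(joint_access_chance(["P2", "P8"]), 4))
```

On these assumptions, an inference joining P1 with P2 and P3 succeeds roughly 64% of the time, while one joining P1 with P2 and P8 succeeds under 9% of the time. Disunity then comes in degrees rather than as a sharp boundary between regions.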

So... just a bit of nerdy fun, with no actual application? Well, fun is excuse enough, I think. But still:

(1.) It's easy to imagine realistic near-future AI cases with these features. A system or network might have a core of shared representations or endorsable propositions and local terminals or agents with stored local representations not all of which are shared with the center. If we treat that AI system as a reasoner, it will be a partly unified reasoner in the described sense. (See also my posts on memory and perception in group minds.)

(2.) Real cases of dissociative identity or multiple personality disorder might potentially be modeled as involving partly disunified reasoning of this sort. Alter 1 might reason with P1-P5 and Alter 2 with P4-P8. (I owe this thought to Nichi Yes.) If so, there might not be a determinate number of distinct reasoners.

(3.) Maybe some more ordinary cases of human inconstancy or seeming irrationality can be modeled in this way: Viviana feeling religious at church, secular at work, or Brittany having one outlook when in a good, high-energy mood and a very different outlook when she's down in the dumps. While we could, and perhaps ordinarily would, model such splintering as temporal fluctuation with beliefs coming and going, a partial unity model has two advantages: It applies straightforwardly even when the person is in neither situation (e.g., asleep), and it doesn't require the cognitive equivalent of frequent erasure and rewriting of the same propositions (everything endures but some subsets cannot be simultaneously activated; see also Elga and Rayo 2021).

(4.) If there are cases of partial phenomenal (that is, experiential) unity, then we might expect there also to be cases of partial cognitive unity, and vice versa. Thus, a feasible model of the one helps increase the plausibility that there might be a feasible model of the other.

Friday, November 22, 2024

Philosophical Fame, 1890-1960

There's a fun new tool at Edhiphy. The designers pulled the full text from twelve leading philosophy journals from 1890 to 1980 and counted the occurrences of philosophers' names. (See note [1] for discussion of error rates in their method.)
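
To make the general idea concrete, here is a minimal sketch of counting name mentions by decade. It is not Edhiphy's actual pipeline, which isn't described here. The function names, the input format, and the toy texts are all invented for illustration. Matching is naive substring counting, so it would produce exactly the kinds of false positives discussed in note [1].

```python
from collections import Counter, defaultdict


def decade_of(year):
    """Map a year such as 1893 to its decade, 1890."""
    return (year // 10) * 10


def top_mentions(articles, names, k=25):
    """Count naive substring matches of each name, grouped by decade.

    articles: iterable of (year, full_text) pairs (a hypothetical format).
    Returns {decade: [(name, count), ...]} with up to k names per decade.
    """
    counts = defaultdict(Counter)
    for year, text in articles:
        for name in names:
            counts[decade_of(year)][name] += text.count(name)
    return {
        decade: [(n, c) for n, c in counter.most_common(k) if c > 0]
        for decade, counter in counts.items()
    }


if __name__ == "__main__":
    # Made-up toy texts, not real journal data.
    toy_articles = [
        (1893, "Kant and Hegel; Kant again."),
        (1897, "Spencer contra Kant."),
        (1912, "Bergson, Bergson, and Russell."),
    ]
    toy_names = ["Kant", "Hegel", "Spencer", "Bergson", "Russell"]
    for decade, ranking in sorted(top_mentions(toy_articles, toy_names).items()):
        print(decade, ranking)
```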

Back in the early 2010s, I posted several bibliometric studies of philosophers' citation or discussion rates over time, mostly based on searches of Philosopher's Index abstracts from 1940 to the present. This new tool gives me a chance to update some of my thinking, using a different method and going further into the past.

One thing I found fascinating in my earlier studies was how some philosophers who used to be huge (for example, Henri Bergson and Herbert Spencer) are now hardly read, while others (for example, Gottlob Frege) have had more staying power.

Let's look at the top 25 most discussed philosophers from each available decade.

1890s:

1. Immanuel Kant
2. Georg Wilhelm Friedrich Hegel
3. Aristotle
4. David Hume
5. Herbert Spencer
6. William James
7. Plato
8. John Stuart Mill
9. René Descartes
10. Wilhelm Wundt
11. Hermann Lotze
12. F. H. Bradley
13. Charles Sanders Peirce
14. Buddha
15. Thomas Hill Green
16. Benedictus de Spinoza
17. Charles Darwin
18. John Locke
19. Gottfried Wilhelm Leibniz
20. Thomas Hobbes
21. Arthur Schopenhauer
22. Socrates
23. Hermann von Helmholtz
24. George Frederick Stout
25. Alexander Bain

Notes:

Only three of the twelve journals existed in the 1890s, so this is a small sample.

Philosophy and empirical psychology were not clearly differentiated as disciplines until approximately the 1910s or 1920s, and these journals covered both areas. (For example, the Journal of Philosophy was originally founded in 1904 as the Journal of Philosophy, Psychology, and Scientific Methods, shortening to the now familiar name in 1921.) Although Wundt, Helmholtz, and Stout were to some extent philosophers, they are probably better understood primarily as early psychologists. William James is of course famously claimed by both fields.

Herbert Spencer, as previously noted, was hugely influential in his day: fifth on this eminent list! Another eminent philosopher on this list (#11) who is hardly known today (at least in mainstream Anglophone circles) is Hermann Lotze.

Most of the others on the list are historical giants, plus some prominent British idealists (F. H. Bradley, Thomas Hill Green) and pragmatists (William James, Charles Sanders Peirce, Alexander Bain) and interestingly (but not representative of later decades) "Buddha". (A spot check reveals that some of these references are to Gautama Buddha or "the Buddha", while others use "buddha" in a more general sense.)

1900s:

1. Immanuel Kant
2. William James
3. Plato
4. F. H. Bradley
5. Georg Wilhelm Friedrich Hegel
6. David Hume
7. Aristotle
8. Herbert Spencer
9. Gottfried Wilhelm Leibniz
10. John Dewey
11. George Berkeley
12. John Stuart Mill
13. George Frederick Stout
14. Thomas Hill Green
15. Josiah Royce
16. Benedictus de Spinoza
17. John Locke
18. Ferdinand Canning Scott Schiller
19. Ernst Mach
20. Wilhelm Wundt
21. James Ward
22. René Descartes
23. Alfred Edward Taylor
24. Henry Sidgwick
25. Bertrand Russell

Notes:

Notice the fast rise of John Dewey (1859-1952), to #10 (#52 in the 1890s list). Other living philosophers in the top ten were James (1842-1910), Bradley (1846-1924), and for part of the period Spencer (1820-1903).

It's also striking to see George Berkeley enter the list so high (#11, compared to #28 in the 1890s) and Descartes fall so fast despite his continuing importance later (from #9 to #22). This could be statistical noise due to the small number of journals, or it could reflect historical trends. I'm not sure.

Our first "analytic" philosopher appears: Bertrand Russell (1872-1970) at #25. He turned 33 in 1905, so he found eminence very young for a philosopher.

Lotze has already fallen off the list (#29 in the 1900s; #29 in the 1910s; #63 in the 1930s, afterwards not in the top 100).

1910s:

1. Henri Bergson
2. Bertrand Russell
3. Immanuel Kant
4. Plato
5. William James
6. Gottfried Wilhelm Leibniz
7. Aristotle
8. Socrates
9. Bernard Bosanquet
10. George Berkeley
11. F. H. Bradley
12. Georg Wilhelm Friedrich Hegel
13. René Descartes
14. Josiah Royce
15. David Hume
16. Isaac Newton
17. John Dewey
18. Friedrich Nietzsche
19. Ferdinand Canning Scott Schiller
20. Arthur Schopenhauer
21. John Locke
22. Benedictus de Spinoza
23. Edwin Holt
24. Isaac Barrow
25. Johann Gottlieb Fichte

Notes:

Henri Bergson (1859-1941) debuts at #1! What a rock star. (He was #63 in the 1900s list.) We forget how huge he was in his day. Russell, who so far has had much more durable influence, rockets up to #2. It's also interesting to see Bernard Bosanquet (1848-1923), who is now little read in mainstream Anglophone circles, at #9.

Josiah Royce is also highly mentioned in this era (#14 in this list, #15 in the 1900s list), despite not being much read now. F.C.S. Schiller (1864-1937) is a similar case (#19 in this list, #18 in the 1900s list).

1920s:

1. Immanuel Kant
2. Plato
3. Aristotle
4. Bernard Bosanquet
5. Georg Wilhelm Friedrich Hegel
6. F. H. Bradley
7. Bertrand Russell
8. Benedictus de Spinoza
9. William James
10. Socrates
11. John Dewey
12. Alfred North Whitehead
13. David Hume
14. George Santayana
15. René Descartes
16. Henri Bergson
17. Albert Einstein
18. C. D. Broad
19. John Locke
20. Gottfried Wilhelm Leibniz
21. George Berkeley
22. Isaac Newton
23. James Ward
24. Samuel Alexander
25. Benedetto Croce

Notes:

I'm struck by how the 1920s returns to the classics at the top of the list, with Kant, Plato, and Aristotle as #1, #2, and #3. Bergson is already down to #16 and Russell has slipped to #7. Most surprising to me, though, is Bosanquet at #4! What?!

1930s:

1. Immanuel Kant
2. Plato
3. Aristotle
4. Benedictus de Spinoza
5. Georg Wilhelm Friedrich Hegel
6. René Descartes
7. Alfred North Whitehead
8. Bertrand Russell
9. David Hume
10. John Locke
11. George Berkeley
12. Socrates
13. Friedrich Nietzsche
14. Rudolf Carnap
15. William James
16. Gottfried Wilhelm Leibniz
17. John Dewey
18. Isaac Newton
19. Clarence Irving Lewis
20. Arthur Oncken Lovejoy
21. Albert Einstein
22. Charles Sanders Peirce
23. F. H. Bradley
24. Ludwig Wittgenstein
25. Bernard Bosanquet

Notes:

Nietzsche rises suddenly (#13; vs #56 in the 1920s list). Wittgenstein also cracks the list at #24 (not even in the top 100 in the 1920s).

With the exception of Whitehead, the top of the list looks like what early 21st century mainstream Anglophone philosophers tend to perceive as the most influential figures in pre-20th-century Western philosophy (see, e.g., Brian Leiter's 2017 poll). The 1930s, perhaps, were for whatever reason a decade more focused on the history of philosophy than on leading contemporary thinkers. (The presence of historian of ideas Arthur Lovejoy [1873-1962] at #20 further reinforces that thought.)

1940s:

1. Immanuel Kant
2. Alfred North Whitehead
3. Aristotle
4. Plato
5. Bertrand Russell
6. John Dewey
7. David Hume
8. William James
9. George Berkeley
10. Charles Sanders Peirce
11. René Descartes
12. Benedictus de Spinoza
13. Edmund Husserl
14. Georg Wilhelm Friedrich Hegel
15. Gottfried Wilhelm Leibniz
16. Thomas Aquinas
17. Socrates
18. Rudolf Carnap
19. Martin Heidegger
20. G. E. Moore
21. John Stuart Mill
22. Isaac Newton
23. Søren Kierkegaard
24. A. J. Ayer
25. John Locke

Notes:

Oh, how people loved Whitehead (#2) in the 1940s!

Edmund Husserl (1859-1938) makes a posthumous appearance at #13 (#31 in the 1920s) and Heidegger (1889-1976) at #19 (#97 in the 1920s), suggesting an impact of Continental phenomenology. I suspect this is due to the inclusion of Philosophy and Phenomenological Research in the database starting 1940. Although the journal is now a bastion of mainstream Anglophone philosophy, in its early decades it included lots of work in Continental phenomenology (as the journal's title suggests).

The philosophers we now think of as the big three American pragmatists have a very strong showing in the 1940s, with Dewey at #6, James at #8, and Peirce at #10.

Thomas Aquinas makes his first and only showing (at #16), suggesting that Catholic philosophy is having more of an impact in this era.

We're also starting to see more analytic philosophers, with G. E. Moore (1873-1958), and A. J. Ayer (1910-1989) now making the list, in addition to Russell and Carnap (1891-1970).

Wittgenstein, surprisingly to me, has fallen off the list all the way down to #73 -- perhaps suggesting that if he hadn't had his second era, his earlier work would have been quickly forgotten.

1950s:

1. Immanuel Kant
2. Plato
3. Aristotle
4. Bertrand Russell
5. David Hume
6. Gilbert Ryle
7. G. E. Moore
8. Willard Van Orman Quine
9. George Berkeley
10. Georg Wilhelm Friedrich Hegel
11. John Dewey
12. Alfred North Whitehead
13. Rudolf Carnap
14. Ludwig Wittgenstein
15. René Descartes
16. John Locke
17. Clarence Irving Lewis
18. Socrates
19. John Stuart Mill
20. Gottfried Wilhelm Leibniz
21. Gottlob Frege
22. A. J. Ayer
23. William James
24. Edmund Husserl
25. Nelson Goodman

By the 1950s, the top eight are four leading historical figures -- Kant, Plato, Aristotle, and Hume -- and four leading analytic philosophers: Russell, Gilbert Ryle (1900-1976), G. E. Moore, and W. V. O. Quine (1908-2000). Neither Ryle nor Quine was among the top 100 in the 1940s, so their rise to #6 and #8 was sudden.

Gottlob Frege (1848-1925) also makes his first, long-posthumous appearance.

1960s:

1. Aristotle
2. Immanuel Kant
3. Ludwig Wittgenstein
4. David Hume
5. Plato
6. René Descartes
7. P. F. Strawson
8. Willard Van Orman Quine
9. Bertrand Russell
10. J. L. Austin
11. John Dewey
12. Rudolf Carnap
13. Edmund Husserl
14. Socrates
15. Norman Malcolm
16. G. E. Moore
17. Gottlob Frege
18. Georg Wilhelm Friedrich Hegel
19. George Berkeley
20. R. M. Hare
21. John Stuart Mill
22. Gilbert Ryle
23. A. J. Ayer
24. Karl Popper
25. Carl Gustav Hempel

Wittgenstein is back with a vengeance at #3. Other analytic philosophers, in order, are P. F. Strawson, Quine, Russell, Austin, Carnap, Norman Malcolm (1911-1990), Moore, Frege, R. M. Hare (1919-2002), Ryle, Ayer, Karl Popper (1902-1994), and Carl Hempel (1905-1997).

Apart from pre-20th-century historical giants, it's all analytic philosophers, except for Dewey and Husserl.

Finally, the 1970s:

1. Willard Van Orman Quine
2. Immanuel Kant
3. David Hume
4. Aristotle
5. Ludwig Wittgenstein
6. Plato
7. John Locke
8. René Descartes
9. Karl Popper
10. Rudolf Carnap
11. Gottlob Frege
12. Edmund Husserl
13. Hans Reichenbach
14. Socrates
15. P. F. Strawson
16. Donald Davidson
17. John Stuart Mill
18. Bertrand Russell
19. Thomas Reid
20. Benedictus de Spinoza
21. Nelson Goodman
22. Carl Gustav Hempel
23. John Rawls
24. Karl Marx
25. Saul Kripke

With the continuing exception of Husserl, the list is again historical giants plus analytic philosophers. Interesting to see Marx enter at #24. Hans Reichenbach (1891-1953) has a strong debut at #13. Ryle's decline is striking, from #6 in the 1950s to #22 in the 1960s to off the list at #51 in the 1970s.

At the very bottom of the list, #25, we see the first "Silent Generation" philosopher: Saul Kripke (1940-2022). In a recent citation analysis of the Stanford Encyclopedia of Philosophy, I found that the Silent Generation has so far had impressive overall influence and staying power in mainstream Anglophone philosophy. It would be interesting to see if this influence continues.

The only philosopher born after 1800 who makes both the 1890s and the 1970s top 25 is John Stuart Mill. Peirce and James still rank among the top 100 in the 1970s (#58 and #86). None of the other stars of the 1890s -- Spencer, Lotze, Bradley, Green -- are still among the top 100 by the 1970s, and I think it's fair to say they are hardly read except by specialists.

Similar remarks apply to most of the stars of the 1900s, 1910s, and 1920s: Bergson, Bosanquet, Royce, Schiller, C. D. Broad, and George Santayana are no longer widely read. Two exceptions are Russell, who persists in the top 25 through the 1970s, and Dewey, who falls from the top 25 but still remains in the top 100, at #87.

Also, in case you didn't notice: no women or people of color (as we would now classify them) appear on any of these lists, apart from "Buddha" in the 1890s.

In my recent Stanford Encyclopedia of Philosophy analysis, the most-cited living philosophers were Timothy Williamson, Martha Nussbaum, Thomas Nagel, Frank Jackson, John Searle, and David Chalmers. However, probably none of them is as dominant now as Spencer, James, Bradley, Russell, Bosanquet, and Bergson were at the peak of their influence.

---------------------------------------

[1] The Edhiphy designers estimate "82%-91%" precision, but I'm not sure what that means. I'd assume that "Wittgenstein" and "Carnap" would hit with almost 100% precision. Does it follow others might be as low as 40%? There certainly are some problems. I noticed, for example, that R. Jay Wallace, born in 1957, has 78 mentions in the 1890s. I spot checked "Russell", "Austin", "James", and "Berkeley", finding only a few false positives for Russell and Austin (e.g., misclassified references to legal philosopher John Austin). I found significantly more false positives for William James (including references to Henry James and some authors with the first name James, such as psychologist James Ward), but still probably not more than 10%. For "Berkeley" there were a similar number of false positives referencing the university or city. I didn't attempt to check for false negatives.
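
On one common reading, precision is the share of flagged mentions that really refer to the intended philosopher: true positives divided by true positives plus false positives. If the reported range is an aggregate over all names, it is in effect a mention-weighted average, so a high overall figure is compatible with some individual names scoring much lower. The sketch below only illustrates that arithmetic. The counts are invented and are not Edhiphy data. The function names are hypothetical, and whether Edhiphy's figure is an aggregate of this kind is an assumption.

```python
def precision(true_pos, false_pos):
    """Share of flagged mentions that are correct: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


def aggregate_precision(per_name):
    """Pooled precision over all names: total TP / (total TP + total FP)."""
    total_tp = sum(tp for tp, _ in per_name.values())
    total_fp = sum(fp for _, fp in per_name.values())
    return precision(total_tp, total_fp)


# Invented (true positive, false positive) counts, for illustration only.
HYPOTHETICAL_COUNTS = {
    "distinctive surname": (990, 10),  # nearly every hit is correct
    "ambiguous surname": (40, 60),     # most hits refer to someone else
}

if __name__ == "__main__":
    for name, (tp, fp) in HYPOTHETICAL_COUNTS.items():
        print(name, round(precision(tp, fp), 3))
    print("aggregate", round(aggregate_precision(HYPOTHETICAL_COUNTS), 3))
```

With these made-up counts, one name sits at 40% precision while the pooled figure stays above 90%, because the frequently mentioned, unambiguous name dominates the total.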

[Bosanquet and Bergson used to be hugely influential]

Tuesday, November 19, 2024

New in Draft: When Counting Conscious Subjects, the Result Needn't Always Be a Determinate Whole Number

(with Sophie R. Nelson)

One philosophical inclination I shared with the late Dan Dennett is a love of weird perspectives on consciousness, which sharply violate ordinary, everyday common sense. When I was invited to contribute to a special issue of Philosophical Psychology in his memory, I thought of his intriguing remark in Consciousness Explained against "the myth of selves as brain-pearls, particular, concrete, countable things", lamenting people's stubborn refusal "to countenance the possibility of quasi-selves, semi-selves, transitional selves" (1991, pp. 424-425). As I discussed in a blog post in June, Dennett's "fame in the brain" view of consciousness naturally suggests that consciousness won't always come in discrete, countable packages, since fame is a gradable, multidimensional phenomenon, with lots of gray area and partial overlap.

So I contacted Sophie R. Nelson, with whom I'd published a paper last year on borderline cases of group minds, and we decided to generalize the idea. On a broad range of naturalistic, scientific approaches to consciousness, we ought to expect that conscious subjects needn't always come in determinate, whole number packages. Sometimes, the number of conscious subjects in an environment should be either indeterminate, or a determinate non-whole number, or best modeled by some more complicated mathematical representation. If some of us have commonsense intuitions to the contrary, such intuitions aren't probative.

Our submission is due November 30, and comments are (as always) very welcome -- either before or after the Nov 30 deadline (since we expect at least one round of revisions).

Abstract:

Could there be 7/8 of a conscious subject, or 1.34 conscious subjects, or an entity indeterminate between being one conscious subject and seventeen? Such possibilities might seem absurd or inconceivable, but our ordinary assumptions on this matter might be radically mistaken. Taking inspiration from Dennett, we argue that, on a wide range of naturalistic views of consciousness, the processes underlying consciousness are sufficiently complex to render it implausible that conscious subjects must always arise in determinate whole numbers. Whole-number-countability might be an accident of typical vertebrate biology. We explore several versions of the inconceivability objection, suggesting that the fact that we cannot imagine what it’s like to be 7/8 or 1.34 or an indeterminate number of conscious subjects is no evidence against the possibility of such subjects. Either the imaginative demand is implicitly self-contradictory (imagine the one, determinate thing it’s like to be an entity there isn’t one, determinate thing it’s like to be) or imaginability in the relevant sense isn’t an appropriate test of possibility (in the same way that the unimaginability, for humans, of bat echolocation experiences does not establish that bat echolocation experiences are impossible).

Full draft here.

[Figure 2 from Schwitzgebel and Nelson, in draft: An entity intermediate or indeterminate between one and three conscious subjects. Solid circles represent determinately conscious mental states. Dotted lines represent indeterminate or intermediate unity among those states.]

Friday, November 15, 2024

Three Models of the Experience of Dreaming: Phenomenal Hallucination, Imagination, and Doxastic Hallucination

What are dreams like, experientially?

One common view is that dreams are like hallucinations. They involve sensory or sensory-like experiences just as if, or almost as if, you were in the environment you are dreaming you are in. If you dream of being Napoleon on the fields of Waterloo, taking in the sights and sounds, then you have visual and auditory experiences much like Napoleon might have had in the same position (except perhaps irrational, bizarre, or otherwise different in specific content). This is probably the predominant view among dream researchers (e.g., Hobson and Revonsuo).

Another view, less common but intriguing, is that dreams are like imaginings. Dreaming you are Napoleon on the fields of Waterloo is like imagining or "daydreaming" that you're there. The experience isn't sensory but imagistic (e.g., Ichikawa and Sosa).

These views are very different!

For example, look at your hands. Now close your eyes and imagine looking at your hands. Unless you're highly unusual, you will probably agree that the first experience is very different from the second experience. On the hallucination model of dreams, dream experience is more like the first (sensory) experience. On the imagination model, dream experience is more like the second (imagery) experience. On pluralist models, dream experiences are sometimes like the one, sometimes like the other (e.g., Rosen and possibly Windt's nuanced version of the hallucination model). (Unfortunately, proponents of the hallucination model sometimes confusingly talk about dream "imagery".)

-----------------------------------

I confess to being tempted by the imagination model. My reason is primarily introspective or immediately retrospective. I sometimes struggle with insomnia, and it's not unusual for me to drift in and out of sleep: lying quietly in bed, eyes closed, I allow myself to drift in daydream, which seems sometimes to merge into sleep and then back into daydream. My immediately remembered dreams seem not so radically different from my eyes-closed daydream imaginations. (Ichikawa describes similar experiences.)

Another consideration is this: Plausibly, the stability and detail of our ordinary sensory experiences depend to a substantial extent on the stabilizing influence of external inputs. It appears both to match my own experience and to be neurophysiologically plausible that the finely detailed, vivid, sharp structure of, say, visual experience would be difficult for my brain to sustain without the constraint of a rich flow of input information. (Alva Noë makes a similar point.)

Now, I don't put a lot of stock in these reflections. There's reason to be skeptical of the accuracy of introspective reports in general, and perhaps dream reports in particular, and I'm willing to apply my own skepticism to myself. But by the same token, what is the main evidence on the other side, in favor of the hallucination model? Mainly, again, introspective report. In particular, it's the fact that people often report their dream experiences as having the rich, sensory-like detail that the hallucination model predicts. Of course, we could just take the easy, obvious, pluralist path of saying that everyone is right about their own experiences. But what fun is that?

-----------------------------------

In fact, I'm inclined to throw a further wrench in things by drawing a distinction between two types of hallucination: phenomenal and doxastic. I introduced this distinction in a blog post in 2013, after reading Oliver Sacks's Hallucinations.

Consider this description, from page 99 of Hallucinations:

The heavens above me, a night sky spangled with eyes of flame, dissolve into the most overpowering array of colors I have ever seen or imagined; many of the colors are entirely new -- areas of the spectrum which I seem to have hitherto overlooked. The colors do not stand still, but move and flow in every direction; my field of vision is a mosaic of unbelievable complexity. To reproduce an instant of it would involve years of labor, that is, if one were able to reproduce colors of equivalent brilliance and intensity.

Here are two ways in which you might come to believe the above about your experience:

(1.) You might actually have visual experiences of the sort described, including of colors entirely new and previously unimagined and of a complexity that would require years of labor to describe.

Or

(2.) you might shortcut all that and simply arrive straightaway at the belief that you are undergoing or have undergone such an experience -- perhaps with the aid of some unusual visual experiences, but not really of the novelty and complexity described.

If the former, you have phenomenally hallucinated wholly novel colors. If the latter, you have only doxastically hallucinated them. I expect that I'm not the first to suggest such a distinction among types of hallucination, but I haven't yet found a precedent.

Mitchell-Yellin and Fischer suggest that some "near death experiences" might also be doxastic hallucinations of this sort. Did your whole life really flash before your eyes in that split second during an auto accident, or did you only form the belief in that experience without the actual experience itself? It's not very neurophysiologically plausible that someone would experience hundreds or thousands of different memory experiences in 500 milliseconds.

-----------------------------------

It seems clear from dream researchers' descriptions of the hallucination model of dreams that they have phenomenal hallucination in mind. But what if dream experiences involve, instead or at least sometimes, doxastic rather than phenomenal hallucinations?

Here, then, is a possibility about dream experience: If I dream I am Napoleon, standing on the fields of Waterloo, I have experiences much like the experiences I have when I merely imagine, in daydream, that I am standing on the fields of Waterloo. But sometimes a doxastic hallucination is added to that imagination: I form the belief that I am having or had rich sensory visual and auditory experience. This doxastic hallucination would explain reports of rich, vivid, detailed sensory-like dream experience without requiring the brain actually to concoct rich, vivid, and detailed visual and auditory experiences.

Indeed, if we go full doxastic hallucination, even the imagination-like experiences would be optional.  (Also, if -- following Sosa -- we don't genuinely believe things while dreaming, we could reframe doxastic hallucinations in terms of whatever quasi-belief analogs occur during dreams.)

[The battle at Waterloo: image source]

Monday, November 11, 2024

New in Draft: The Copernican Argument for Alien Consciousness; The Mimicry Argument Against Robot Consciousness

(with Jeremy Pober)

Over the past several years, I've posted a few times on what I call the "Copernican Argument" for thinking that behaviorally sophisticated space aliens would be conscious, even if they are constituted very differently from us (here, here, here, here). I've also posted a few times on what I call the "Mimicry Argument" against attributing consciousness to AI systems or robots that were designed to mimic the superficial signs of human consciousness (including current Large Language Models like ChatGPT and Claude) (here, here, here).

Finally, I have a circulatable paper in draft that deals with these issues, written in collaboration with Jeremy Pober, and tested with audiences at Trent University, Harvey Mudd, New York University, the Agency and Intentions in AI conference in Göttingen, Jagiellonian University, the Oxford Mind Seminar, University of Lisbon, NOVA Lisbon University, University of Hamburg, and the Philosophy of Neuroscience/Mind Writing Group.

It's a complicated paper! Several philosophers have advised me that the Copernican Argument is one paper and the Mimicry Argument is another. Maybe they are right. But I also think that there's a lot to be gained from advancing these arguments side by side: Each shines light on the boundaries of the other. The result, though intricate, is I hope not too intricate for evaluation and comprehensibility. (I might still change my mind about that.)


Abstract:

On broadly Copernican grounds, we are entitled to default assume that apparently behaviorally sophisticated extraterrestrial entities (“aliens”) would be conscious. Otherwise, we humans would be inexplicably, implausibly lucky to have consciousness, while similarly behaviorally sophisticated entities elsewhere would be mere shells, devoid of consciousness. However, this Copernican default assumption is canceled in the case of behaviorally sophisticated entities designed to mimic superficial features associated with consciousness in humans (“consciousness mimics”), and in particular a broad class of current, near-future, and hypothetical robots. These considerations, which we formulate, respectively, as the Copernican and Mimicry Arguments, jointly defeat an otherwise potentially attractive parity principle, according to which we should apply the same types of behavioral or cognitive tests to aliens and robots, attributing or denying consciousness similarly to the extent they perform similarly. Instead of grounding speculations about alien and robot consciousness in metaphysical or scientific theories about the physical or functional bases of consciousness, our approach appeals directly to the epistemic principles of Copernican mediocrity and inference to the best explanation. This permits us to justify certain default assumptions about consciousness while remaining to a substantial extent neutral about specific metaphysical and scientific theories.

Full paper here.


As always, questions/comments/objections welcome here on the blog, on my social media accounts, or by email to my UCR address.

Wednesday, October 30, 2024

The Ethics of Harmonizing with the Dao

Reading the ancient Chinese philosophers Xunzi and Zhuangzi, I am inspired to articulate an ethics of harmonizing with the dao (the "way"). This ethics doesn't quite map onto any of the three conceptualizations of ethics that are standard in Western philosophy (consequentialism, deontology, and virtue ethics), nor is it exactly a "role ethics" of the sort sometimes attributed to ancient Confucians.

Xunzi

The ancient Confucian Xunzi articulates a vision of the world in which Heaven, Earth, and humanity operate in harmony:

Heaven has its proper seasons,
Earth has its proper resources,
And humankind has its proper order,
-- this is called being able to form a triad
(Ch 17, l. 34-37; Hutton trans. 2014, p. 176).

Heaven (tian, literally the sky, but with strong religious associations) and Earth are jointly responsible for what we might now call the "laws of nature" and all "natural" phenomena -- including, for example, the turning of the seasons, the patterns of wind and rain, the tendency for plants and animals to thrive under certain conditions and wither under other conditions. Also belonging to these natural phenomena are the raw materials with which humans work: not only the raw materials of wood, metal, and fiber, but also the raw material of natural human inclinations: our tendency to enjoy delicious tastes, our tendency to react angrily to provocations, our general preference for kin over strangers.

Xunzi views humanity's task as creating the third corner of a triad with Heaven and Earth by inventing customs and standards of proper behavior that allow us to harmonize with Heaven and Earth, and with each other. For example, through trial and error, our ancestors learned the proper times and methods for sowing and reaping, how to regulate flooding rivers, how to sharpen steel and straighten wood, how to make pots that won't leak, how to make houses that won't fall over, and so on. Our ancestors also -- again through trial and error -- learned the proper rituals and customs and standards of behavior that permit people to coexist harmoniously with each other without chaotic conflict, without excessive or inappropriate emotions, and with an allocation of goods that allows all to flourish according to their status and social role.

Following the dao can be conceptualized for Xunzi, then, as aligning harmoniously into this triad. Abide by the customs and standards of behavior that contribute to the harmonious whole, in which crops are properly planted, towns are properly constructed, the crafts flourish, and humans thrive in an orderly society.

Each of us has a different role, in accord with the proper customs of a well-ordered society: the barley farmer has one role, the soldier another role, the noblewoman yet another, the traveling merchant yet another. It's not unreasonable to view Xunzi's ethics as a kind of role ethics, according to which the fundamental moral principle is that one adheres to one's proper role in society. It's also not unreasonable to think of the customs and standards of proper behavior as a set of rules to which one ought to adhere (those rules applying in different ways according to one's position in society), and thus to view Xunzi's ethics as a kind of deontological (rule-based) ethics. However, there might also be room to interpret harmonious alignment with the dao as the most fundamental feature of ethical behavior. Adherence to one's role and to the proper traditional customs and practices, on this interpretation of Xunzi, would be only derivatively good, because doing so typically constitutes harmonious alignment.

A test case is to imagine, through Xunzi's eyes, whether a morally well-developed sage might be ethically correct sometimes to act contrary to their role and to the best traditional standards of good behavior, if they correctly see that by doing so they contribute better to the overall harmony of Heaven, Earth, and humankind. I'm tempted to think that Xunzi would indeed permit this -- though only very cautiously, since he is pessimistic about the moral wisdom of ordinary people -- and thus that for him harmonious alignment with the dao is more fundamental than roles and rules. However, I'm not sure I can find direct textual support in favor of this interpretation; it's possible I'm being overly "charitable".

A Zhuangzian Correction

A Xunzian ethics of this sort is, I think, somewhat attractive. But it is also deeply traditionalist and conformist in a way I find unappealing. It could use a Zhuangzian twist -- and the idea of "harmonizing with the dao" is at least as Zhuangzian (and "Daoist") as it is Confucian.

Zhuangzi imagines a wilder, more wondrous cosmos than Xunzi's neatly ordered triad of Heaven, Earth, and humankind -- symbolized (though it's disputable how literally) by people so enlightened that they can walk without touching the ground; trees that count 8000 years as a single autumn; gracious emperors with no eyes, ears, nose, or mouth; people with skin like frost who live by drinking dew; enormous, useless trees who speak to us in dreams; and more. This is the dao, wild beyond human comprehension, with which Zhuangzi aims to harmonize.

There are, I think, in Zhuangzi's picture -- though he would resist any effort to fully capture it in words -- ways of flowing harmoniously along with this wondrous and incomprehensible dao and ways of straining unproductively against it. One can be easygoing and open-minded, welcome surprise and difference, not insist on jamming everything into preconceived frames and plans; and one can contribute to the delightful weirdness of the world in one's own unique way. This is Zhuangzian harmony. You become a part of a world that is richer and more wondrous because it contains you, while allowing other wonderful things to also naturally unfold.

In a radical reading of Zhuangzi, ethical obligations and social roles fall away completely. There is little talk in Zhuangzi's Inner Chapters, for example, of our obligation to support others. I don't know that we have to read Zhuangzi radically; but regardless of that question of interpretation, I suggest that there's an attractive middle between Xunzi's conventionalism and Zhuangzi's wildness. Each can serve as a corrective to the other.

In the ethical picture that emerges from this compromise, we each contribute uniquely to a semi-ordered cosmos, participating in social harmony, but not rigidly -- also transcending that harmony, breaking rules and traditions for the better, making the world richer and more wondrous, each in our diverse ways, while also supporting others who contribute in their different ways, whether those others are human, animal, plant, or natural phenomena.

Contrasts

This is not a consequentialist ethics: It is not that our actions are evaluated in terms of the good or bad consequences they have (and still less that the actions are evaluated by a summation of the good minus the bad consequences). Instead, harmonizing with the dao is to participate in something grand, without need of a further objective. Like the deontologist, Xunzi and Zhuangzi and my imagined compromise philosopher needn't think that right or harmonious action will always have good long-term results. Nor is it a deontological or role ethics: There is no set of rules one must always follow or some role one must always adhere to. Nor is it a virtue ethics: There is no set of virtues to which we all must aspire or a distinctive pattern of human flourishing that constitutes the highest attainment. We each contribute in different ways -- and if some virtues often prove to be important, they are derivatively important in the same way that rules and roles can be derivatively important. They are important only because, and to the extent that, having those virtues enables or constitutes one's contribution to the magnificent web of being.

So although there are resonances with the more pluralistic forms of consequentialism, and virtue ethics, and role ethics, and even deontology (trivially or degenerately, if the rule is just "harmonize with the dao"), the classical Chinese ethical ideal of harmonizing with the dao differs somewhat from all of these familiar (to professional philosophers) Western ethical approaches.

Many of these other approaches also contain an implicit intellectualism or elitism, in which ideal ethical goodness requires intellectual attainment: wisdom, or a sophisticated ability to weigh consequences or evaluate and apply rules -- far beyond, for example, the capacities of someone with severe cognitive disabilities. With enough Zhuangzi in the mix, such elitism evaporates. A severely cognitively disabled person, or a magnificently weird nonhuman animal, might far exceed any ordinary adult philosopher in their capacity to harmonize with the dao and might contribute more to the rich tapestry of the world.

Perhaps an ethics of harmonizing with the dao can resonate with some 21st-century Anglophone readers, despite its origins in ancient China. It is not, I think, as alien as it might seem from its reliance on the concept of dao and its failure to fit into the standard ethical triumvirate of consequentialism, deontology, and virtue ethics. The fundamental idea should be attractive to some: We each contribute by instantiating a unique piece of a magnificent world, a world which would be less magnificent without us.

Tuesday, October 22, 2024

An Objection to Chalmers's Fading Qualia Argument

[Note: This is a long and dense post. Buckle up.]

In one chapter of his influential 1996 book, David Chalmers defends the view that consciousness arises in virtue of the functional organization of the brain rather than in virtue of the brain's material substrate.  That is, if there were entities that were functionally/organizationally identical to humans but made out of different stuff (e.g. silicon chips), they would be just as conscious as we are.  He defends this view, in part, with what he calls the Fading Qualia Argument.  The argument is enticing, but I think it doesn't succeed.

Chalmers, Robot, and the Target Audience of the Argument

Drawing on thought experiments from Pylyshyn, Savitt, and Cuda, Chalmers begins by imagining two cases: himself and "Robot".  Robot is a functional isomorph of Chalmers, but constructed of different materials.  For concreteness (but this isn't essential), we might imagine that Robot has a brain with the exact same neural architecture as Chalmers' brain, except that the neurons are made of silicon chips.

Because Chalmers and Robot are functional isomorphs, they will respond in the same way to all stimuli.  For example, if you ask Robot if it is conscious, it will emit, "Yes, of course!" (or whatever Chalmers would say if asked that question).  If you step on Robot's toe, Robot will pull its foot back and protest.  And so on.

For purposes of this argument, we don't want to assume that Robot is conscious, despite its architectural and functional similarity to Chalmers.  The Fading Qualia Argument aims to show that Robot is conscious, starting from premises that are neutral on the question.  The aim is to win over those who think that maybe being carbon-based or having certain biochemical properties is essential for consciousness, so that a functional isomorph made of the wrong stuff would only misleadingly look like it's conscious.  The target audience for this argument is someone concerned that for all Robot's similar mid-level architecture and all of its seeming "speech" and "pain" behavior, Robot really has no genuinely conscious experiences at all, in virtue of lacking the right biochemistry -- that it's merely a consciousness mimic, rather than a genuinely conscious entity.

The Slippery Slope of Introspection

Chalmers asks us to imagine a series of cases intermediate between him and Robot.  We might imagine, for example, a series each of whose members differs by one neuron.  Entity 0 is Chalmers.  Entity 1 is Chalmers with one silicon chip neuron replacing a biological neuron.  Entity 2 is Chalmers with two silicon chip neurons replacing two biological neurons.  And so on to Entity N, Robot, all of whose neurons are silicon.  Again, the exact nature of the replacements isn't essential to the argument.  The core thought is just this: Robot is a functional isomorph of Chalmers, but constructed of different materials; and between Chalmers and Robot we can construct a series of cases each of which is only a tiny bit different from its neighbors.

Now if this is a coherent setup, the person who wants to deny consciousness to Robot faces a dilemma.  Either (1.) at some point in the series, consciousness suddenly winks out -- between Entity I and Entity I+1, for some value of I.  Or (2.) consciousness somehow slowly fades away in the series.

Option (1) seems implausible.  Chalmers, presumably, has a rich welter of conscious experience (at least, we can choose a moment at which he does).  A priori, it would be odd if the big metaphysical jump from that rich welter of experience to zero experience would occur with an arbitrarily tiny change between Entity I and Entity I+1.  And empirically, our best understanding of the brain is that tiny, single-neuron-and-smaller differences rarely have such dramatic effects (unless they cascade into larger differences).  Consciousness is a property of large assemblies of neurons, robust to tiny changes.

But option (2) also seems implausible, for it would seem to involve massive introspective error.  Suppose that Entity I is an intermediate case with very much reduced, but not entirely absent, consciousness.  Chalmers suggests that instead of having bright red visual experience, Entity I has tepid pink experience.  (I'm inclined to think that this isn't the best way to think about fading or borderline consciousness, since it's natural to think of pink experiences as just different in experienced content from red cases, rather than less experiential than red cases.  But as I've argued elsewhere, genuinely borderline consciousness is difficult or impossible to imaginatively conceive, so I won't press Chalmers on this point.)

By stipulation, since Entity I is a functional isomorph, it will give the same reports about its experience as Chalmers himself would.  In other words, Entity I -- despite being barely or borderline conscious -- will say "Oh yes, I have vividly bright red experiences -- a whole welter of exciting phenomenology!"  Since this is false of Entity I, Entity I is just wrong about that.  But also, since it's a functional isomorph, there's no weird malfunction going on either, that would explain this strange report.  We ordinarily think that people are reliable introspectors of their experience; so we should think the same of Entity I.  Thus, option (2), gradual fading, generates an implausible tension: We have to believe that Entity I is radically introspectively mistaken; but that involves committing to an implausible degree of introspective error.

Therefore, neither option (1) nor option (2) is plausible.  But if Robot were not conscious, either (1) or (2) would have to be true for at least one Entity I.  Therefore, Robot is conscious.  And therefore, functional isomorphism is sufficient for consciousness.  It doesn't matter what materials an entity is made of.
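
The shape of the dilemma can be made concrete with a toy sketch in Python. Nothing here comes from Chalmers: the 0-to-1 "experience level", the ten-step series, the linear fade, and every function name are illustrative assumptions. The sketch only shows that, because the report is fixed by functional isomorphism, any hypothesis on which Robot lacks experience needs either one large single-step drop or a stretch of entities whose reports outrun their experience.

```python
# Toy model of the replacement series. The 0.0-1.0 "experience level" and
# all names below are illustrative assumptions, not part of the argument.

def report(entity_index):
    """Functional isomorphs all give the same report, whatever their index."""
    return "I have vividly bright red experiences"


def sudden_profile(n, cutoff):
    """Option (1): full experience through Entity `cutoff`, none after."""
    return [1.0 if i <= cutoff else 0.0 for i in range(n + 1)]


def fading_profile(n):
    """Option (2): experience declines linearly from Entity 0 to Entity n."""
    return [1.0 - i / n for i in range(n + 1)]


def largest_single_step_drop(profile):
    """Biggest loss of experience between any two neighboring entities."""
    return max(a - b for a, b in zip(profile, profile[1:]))


def faded_but_reporting_vivid(profile, threshold=0.5):
    """Entities with some experience, but less than `threshold`, all of
    which still issue the same vivid report."""
    return [i for i, level in enumerate(profile) if 0.0 < level < threshold]


N = 10
print(largest_single_step_drop(sudden_profile(N, cutoff=4)))
print(largest_single_step_drop(fading_profile(N)))
print(faded_but_reporting_vivid(fading_profile(N)))
print(faded_but_reporting_vivid(sudden_profile(N, cutoff=4)))
```

On the sudden profile, the entire loss is packed into a single neuron's difference and no entity is ever partly faded. On the fading profile, no single step matters much, but a run of entities (here 6 through 9) have faded experience while still reporting vivid redness.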

We Can't Trust Robot "Introspection"

I acknowledge that it's an appealing argument.  However, Chalmers' response to option (2) should be unconvincing to the argument's target audience.

I have argued extensively that human introspection, even of currently ongoing conscious experience, is highly unreliable.  However, my reply today won't lean on that aspect of my work.  What I want to argue instead is that the assumed audience for this argument should not think that the introspection (or "introspection" -- I'll explain the scare quotes in a minute) of Entity I is reliable.

Recall that the target audience for the argument is someone who is antecedently neutral about Robot's consciousness.  But of course by stipulation, Robot will say (or "say") the same things about its experiences that Chalmers will say.  Just like Chalmers, and just like Entity I, it will say "Oh yes, I have vividly bright red experiences -- a whole welter of exciting phenomenology!"  The audience for Chalmers' argument must therefore initially doubt that such statements, or seeming statements, as issued by Robot, are reliable signals of consciousness.  If the audience already trusted these reports, there would be no need for the argument.

There are two possible ways to conceptualize Robot's reports, if they are not accurate introspections: (a.) They might be inaccurate introspections.  (b.) They might not be introspections at all.  Option (a) allows that Robot, despite lacking conscious experience, is capable of meaningful speech and is capable of introspecting, though any introspective reports of consciousness will be erroneous.  Option (b) is preferred if we think that genuinely meaningful language requires consciousness and/or that no cognitive process that fails to target a genuinely conscious experience in fact deserves to be called introspection.  On option (b) Robot only "introspects" in scare quotes.  It doesn't actually introspect.

Option (a) thus assumes introspective fallibilism, while option (b) is compatible with introspective infallibilism.

The audience who is to be convinced by the slow-fade version of the Fading Qualia Argument must trust the introspective reports (or "introspective reports") of the intermediate entities while not trusting those of Robot.  Given that some of the intermediate entities are extremely similar to Robot -- e.g., Entity N-1, who is only one neuron different -- it would be awkward and implausible to assume reliability for all the intermediate entities while not doing so for Robot.

Now plausibly, if there is a slow fadeout, it's not going to be still going on with an entity as close to Robot as Entity N-1, so the relevant cases will be somewhere nearer the middle.  Stipulate, then, two values I and J not very far separated (0 < I < J < N) such that we can reasonably assume that if Robot is nonconscious, so is Entity J, while we cannot reasonably assume that if Robot is nonconscious, so is Entity I.  For consistency with their doubts about the introspective reports (or "introspective reports") of Robot, the target audience should have similar doubts about Entity J.  But now it's unclear why they should be confident in the reports of Entity I, which by stipulation is not far separated from Entity J.  Maybe it's a faded case, despite its report of vivid experience.

Here's one way to think about it.  Setting aside introspective skepticism about normal humans, we should trust the reports of Chalmers / Entity 0.  But ex hypothesi, the target audience for the argument should not trust the "introspective reports" of Robot / Entity N.  It's then an open question whether we should trust the reports of the relevant intermediate, possibly experientially faded, entities.  We could either generalize our trust of Chalmers down the line or generalize our mistrust of Robot up the line.  Given the symmetry of the situation, it's not clear which the better approach is, or how far down or up the slippery slope we should generalize the trust or mistrust.

For Chalmers' argument to work, we must be warranted in trusting the reports of Entity I at whatever point the fade-out is happening.  To settle this question, Chalmers needs to do more than appeal to the general reliability of introspection in normal human cases and the lack of functional differences between him, Robot, and the intermediate entities.  Even an a priori argument that introspection is infallible will not serve his purposes, because then the open question becomes whether Robot and the relevant intermediate entities are actually introspecting.

Furthermore, if there is introspective error by Entity I, there's a tidy explanation of why that introspective error would be unsurprising.  For simplicity, assume that introspection occurs in the Introspection Module located in the pineal gland, and that it works by sending queries to other parts of the brain, asking questions like "Hey, occipital lobe, is red experience going on there right now?", reaching introspective judgments based on the signals that it gets in reply.  If Entity I has a functioning, biological Introspection Module but a replaced, silicon occipital lobe, and if there really is no red experience going on in the occipital lobe, we can see why Entity I would be mistaken: Its Introspection Module is getting exactly the same signal from the occipital lobe as it would receive if red experience were in fact present.

It's highly doubtful that introspection is as neat a process as I've just described.  But the point remains.  If Entity I is introspectively unreliable, a perfectly good explanation beckons: Whatever cognitive processes subserve the introspective reporting are going to generate the same signals -- including misleading signals, if experience is absent -- as they would in the case where experience is present and accurately reported.  Thus, unreliability would simply be what we should expect.
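
The point can be put as a toy sketch in Python, under the same deliberately oversimplified assumptions as the Introspection Module story. The class, the function names, and the `has_red_experience` flag are all hypothetical. The flag stands in for exactly the fact the target audience is unsure about, and setting it to `False` for the silicon lobe is the skeptic's supposition, not a finding.

```python
from dataclasses import dataclass


@dataclass
class OccipitalLobe:
    substrate: str            # "biological" or "silicon"
    has_red_experience: bool  # the disputed fact; never sent downstream

    def reply_to_query(self, red_stimulus: bool) -> bool:
        # Functional isomorphism: the reply depends only on the stimulus,
        # not on the substrate and not on whether experience is present.
        return red_stimulus


def introspective_judgment(lobe: OccipitalLobe, red_stimulus: bool) -> str:
    """The module's verdict, based solely on the signal it receives."""
    signal = lobe.reply_to_query(red_stimulus)
    return "red experience present" if signal else "no red experience"


def judgment_is_accurate(lobe: OccipitalLobe, red_stimulus: bool) -> bool:
    """Compare the verdict with the (stipulated) fact about experience."""
    judged_present = lobe.reply_to_query(red_stimulus)
    actually_present = red_stimulus and lobe.has_red_experience
    return judged_present == actually_present


biological = OccipitalLobe("biological", has_red_experience=True)
silicon = OccipitalLobe("silicon", has_red_experience=False)  # skeptic's supposition

print(introspective_judgment(biological, True))
print(introspective_judgment(silicon, True))
print(judgment_is_accurate(biological, True), judgment_is_accurate(silicon, True))
```

Both lobes yield the same verdict, because the verdict is computed from the signal alone. Only the stipulated flag separates the accurate case from the mistaken one, and that flag is invisible to the module.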

Now it's surely in some respects more elegant if we can treat Chalmers, Robot, and all the intermediate entities analogously, as conscious and accurately reporting their experience.  The Fading Qualia setup nicely displays the complexity or inelegance of thinking otherwise.  But the intended audience of the Fading Qualia argument is someone who wonders whether experience tracks so neatly onto function, someone who suspects that nature might in fact be complex or inelegant in exactly this respect, such that it's (nomologically/naturally/scientifically) possible to have a behavioral/functional isomorph who "reports" experiences but who in fact entirely lacks them.  The target audience who is initially neutral about the consciousness of Robot should thus remain unmoved by the Fading Qualia argument.

This isn't to say I disagree with Chalmers' conclusion.  I've advanced a very different argument for a similar conclusion: The Copernican Argument for Alien Consciousness, which turns on the idea that it's unlikely that, among all behaviorally sophisticated alien species of radically different structure that probably exist in the universe, humans would be so lucky as to be among the special few with just the right underlying stuff to be conscious.  Central to the Fading Qualia argument in particular is Chalmers' appeal to the presumably reliable introspection of the intermediate entities.  My concern is that we cannot justifiably make that presumption.

Dancing Qualia

Chalmers pairs the Fading Qualia argument with a related but more complex Dancing Qualia argument, which he characterizes as the stronger of the two arguments.  Without entering into detail, Chalmers posits for sake of reductio ad absurdum that the alternative medium (e.g., silicon) hosts experiences but of a different qualitative character (e.g., color inverted).  We install a system in the alternative medium as a backup circuit with effectors and transducers to the rest of the brain.  For example, in addition to having a biological occipital lobe, you also have a functionally identical silicon backup occipital lobe.  Initially the silicon occipital lobe backup circuit is powered off.  But you can power it on -- and power off your biological occipital lobe -- by flipping a switch.  Since the silicon lobe is functionally identical to the biological lobe, the rest of the brain should register no difference.

Now, if you switch between normal neural processing and the backup silicon processor, you should have a very different experience (per the assumption of the reductio) but you should not be able to introspectively report that different experience (since the backup circuit interacts identically with the rest of the brain).  That would again be a strange failure of introspection.  So (per the rules of reductio) we conclude that the initial premise was mistaken: Normal neural processing should generate the same types of experience as functionally identical processing in a silicon processor.

(I might quibble that you-with-backup-circuit is not functionally isomorphic to you-without-backup-circuit -- after all, you now have a switch and two different parallel processor streams -- and if consciousness supervenes on the whole system rather than just local parts, that's possibly a relevant change that will cause the experience to be different from the experience of either an unmodified brain or an isomorphic silicon brain.  But set this issue aside.)

The Dancing Qualia argument is vulnerable on the introspective accuracy assumption, much as the Fading Qualia argument is.  Again for simplicity, suppose a biological Introspection Module.  Suppose that what is backed up is the portion of the brain that is locally responsible for red experience.  Ex hypothesi, the silicon backup gives rise to non-red experience but delivers to the Introspection Module exactly the same inputs as that module would normally receive from an organic brain part experiencing red.  This is exactly the type of case where we should expect introspection to be unreliable.

Consider an analogous case of vision.  Looking at a green tree 50 feet away in good light, my vision is reliable.  Now substitute a red tree in the same location and a mechanism between me and the tree such that all the red light is converted into green light, so that I get exactly the same visual input I would normally receive from looking at a green tree.  Even if vision is highly reliable in normal circumstances, it is no surprise in this particular circumstance if I mistakenly judge the red tree to be green!

As I acknowledged before, this is a cartoon model of introspection.  Here's another way introspection might work: What matters is what is represented in the Introspection Module itself.  So if the Introspection Module says "red", necessarily I experience red.  In that case, in order to get Dancing Qualia, we need to create an alternate backup circuit for the Introspection Module itself.  When we flip the switch, we switch from Biological Introspection Module to Silicon Introspection Module.  Ex hypothesi, the experiences really are different but the Introspection Module represents them functionally in the same way, and the inputs and outputs to and from the rest of the brain don't differ.  So of course there won't be any experiential difference that I would conceptualize and report.  There would be some difference in qualia, but I wouldn't have the conceptual tools or memorial mechanisms to notice or remember the difference.

This is not obviously absurd.  In ordinary life we arguably experience a minor version of this all the time: I experience some specific shade of maroon.  After a blink, I experience some slightly different shade of maroon.  I might entirely fail to conceptualize or notice the difference: My color concepts and color memory are not so fine grained.  The hypothesized red/green difference in Dancing Qualia is a much larger difference -- so it's not a problem of fineness of grain -- but fundamentally the explanation of my failure is similar: I have no concept or memory suited to track the difference.

On more holist/complicated views of introspection, the story will be more complicated, but I think the burden of proof would be on Chalmers to show that some blend of the two strategies isn't sufficient to generate suspicions of introspective unreliability in the Dancing Qualia cases.

Related Arguments

This response to the Fading Qualia argument draws on David Billy Udell's and my similar critique of Susan Schneider's Chip Test for AI consciousness (see also my chapter "How to Accidentally Become a Zombie Robot" in A Theory of Jerks and Other Philosophical Misadventures).

Although this critique of the Fading Qualia argument has been bouncing around in my head since I first read The Conscious Mind in the late 1990s, it felt a little complex for a blog post but not quite enough for a publishable paper.  But reading Ned Block's similar critique in his 2023 book has inspired me to express my version of the critique.  I agree with Block's observations that "the pathology that [Entity I] has [is] one of the conditions that makes introspection unreliable" (p. 455) and that "cases with which we are familiar provide no precedent for such massive unreliability" (p. 457).

Thursday, October 17, 2024

Join My Graduate Seminar on Robot, Alien, and AI Consciousness

This coming winter quarter (Jan 10 - Mar 20), I'll be teaching a graduate seminar on "Robot, Alien, and AI Consciousness".  As an experiment, I am inviting up to five PhD students or postdoctoral students in philosophy from outside UC Riverside to participate remotely in the course.  If five remote students do join the course, I will convert the entire course to remote (through Zoom) so that all participants are on an equal footing.  (If fewer than five join, I will make the course hybrid.  We have good hybrid technology -- e.g., a huge projector screen -- so remote students will be well incorporated into the class.)

I've never done anything like this.  I am inspired by Myisha Cherry's (more ambitious) fellows program for graduate students working on emotion, which builds community across campuses among graduate students working on that topic.  We'll see how it goes.  It might be awesome.  It might be a dud.

Course Description:
We will attempt to assess under what conditions we would be warranted in thinking that a robot, AI system, or naturally-evolved space alien would, or would not, be conscious.  Readings will mostly be philosophy but will also include selections in science fiction, Artificial Intelligence research, and astrobiology.  (I haven't yet finalized the reading list and suggestions are welcome.)

Meeting Times and Requirements:
The course will meet every Friday from 2:00-4:50 pm Pacific Time on Zoom, from Jan 10 to Mar 14.  Students taking the course S/NC will submit brief written weekly reflections.  Students taking the course for a grade should also submit a final paper by Mar 20 (extensions liberally granted).

Eligibility for Non-UCR Students:
To be eligible to participate in the course, you should be fluent in English and either enrolled as a PhD student in Philosophy or working in a paid postdoctoral position in Philosophy.  It is not required that your university permit you to officially enroll in the course for credit (though I welcome such arrangements).  At least one prior upper-division or grad-level philosophy of mind class is required.

Application Procedure for Non-UCR Students:
Email me a CV (including relevant past courses), a writing sample, and a one-paragraph statement expressing your background and/or interests and/or plans on the topic.  Also, indicate whether you plan to write a graded final paper.  (My UCR email is widely discoverable; I won't risk increasing the volume of spam by printing it here.)

If there are more than five eligible applicants, I will select among them based on considerations of strength of background, diversity and relevance of interests and background, and strength and relevance of the writing sample.  If, but only if, other factors are approximately equal, students who plan to submit written work for a grade will be preferred over those who plan only to attend and write the brief weekly reflections.

Deadline
Apply by Nov 5.  I will reply with a decision by Nov 15.

Thursday, October 10, 2024

A Dispositionalist Approach to Desire and Valuing

What is it to desire or value something? Is it to feel a certain way? Is it instead to have a certain sort of representational architecture (a stored representation of "X is good" or a representation of X in one's "desire box")? Is it to have a certain type of neurological structure associated with reward and learning?

I hold, instead, that to desire or value something is a matter of being disposed to act and react in a certain characteristic pattern of ways; to desire or value is to have a certain type of habitual posture toward the world. I have long defended a dispositionalist theory of belief (e.g., here and here). In 2013, I extended this dispositionalist theory to "attitudes" generally, explicitly including desiring and valuing. But I have never written a full-length journal article specifically on desire and value. Here's a preliminary sketch of my approach.

Liberal Dispositionalism about Desire

In the 20th century, dispositionalist accounts of attitudes were generally associated with behaviorism: To believe P or desire Q is to be disposed to behave in a particular set of ways. Such accounts fell out of favor as behaviorism fell out of favor. One of the main innovations of 21st-century dispositionalism -- "liberal dispositionalism" as I call it -- was to explicitly put other types of dispositions on equal footing with behavioral dispositions, thus avoiding the troubles that plague behaviorist approaches to the mind.

I favor sorting the dispositions into three broad classes: behavioral, phenomenal (that is, pertaining to conscious experience), and cognitive. Suppose I want my daughter to do well in school. On a liberal dispositional account of desire, to have this desire is neither more nor less than to possess a certain suite of behavioral, phenomenal, and cognitive dispositions. Behaviorally, it is to be disposed, for example, not to interfere with her homework, to encourage her when she does well, to provide the resources she needs to succeed, and so on. Phenomenally, it is to feel good when I hear of her successes, to feel disappointed when she fails, to fantasize happily about good educational outcomes, to feel anxious if she doesn't seem to be putting in the necessary work, and so on. And cognitively, it is to be disposed to enter further mental states under various relevant conditions, such as to engage in certain types of planning and to reject incompatible desires, such as that she drop out of school to pursue a career in fashion.

As in the case of belief, all these dispositions hold only ceteris paribus, that is, other things being equal, or when conditions are normal, absent competing influences. I won't encourage her to do her homework if the house is on fire. And as in the case of belief, few of us will perfectly match any dispositional profile so constructed; it's a matter of whether we match closely enough. A natural comparison is personality traits: To be an extravert is just to match, close enough and ceteris paribus, the dispositional profile constitutive of extraversion: being disposed to enjoy parties, to make friends easily, to take the lead in social situations, and so on.

Every theory of desire will hold that, generally speaking, if one has a desire one also has a certain suite of appropriate dispositions. What is distinctive about dispositionalism is that it says that that is all desire is. Once your dispositional profile is fully characterized, that's the end of the story as far as the existence or non-existence of desire is concerned. Maybe there's some representation in the desire box (if human architecture works a certain way), or maybe the reward system is in some particular state, or maybe you buzz with a certain feeling; or maybe none of that is the case. Such facts, if they are facts, are contingent associations or implementations. Anyone who matches the dispositional profile constitutive of desire to an appropriate degree does desire, regardless of whatever else is true of them; and anyone who does not match that dispositional profile does not desire. If there were space aliens with a radically different cognitive architecture, they would desire if and only if they matched the relevant dispositional profile. Cognitive and physiological architecture is only derivatively important to the metaphysics of desire: It is important only because, and to the extent that, it undergirds the dispositional profile.

Short-Term vs. Long-Term Desire

Arguably, there are two very different types of desire: short-term and long-term. I desire (short-term) a beer from my fridge, right now. I also desire (long-term) that my fridge be stocked with beer, in general. Sometimes the first type of desire is called "occurrent" and the second "dispositional". Such occurrent desires plausibly feel a certain way (there's a feeling of craving that beer), while the dispositional desires don't. Maybe only the latter are subject to dispositional analysis.

I reject that view. Short-term and long-term desires can both be analyzed dispositionally and exist on a spectrum rather than being different in kind. The difference lies in the duration of the dispositional structure. If I currently want a beer, I have a suite of dispositions constitutive of that desire: I'm disposed to go to the fridge to get it; I would feel disappointed if I discovered I was out of beer; I'm inclined to make a plan to get that beer as soon as the game pauses for commercials. If I lack these dispositions, it's not true that I want a beer. But the dispositions only endure briefly. As soon as I get that beer, they vanish.

Now it might be true that often (even typically) certain feelings tend to accompany short-term desires. But if so, on my view they are signals of the desires or at most the surface manifestations of the dispositional structures constitutive of desire. If the feeling is disconnected from other dispositions, it constitutes a "wraith" of a desire, not a full-blown desire (see Schwitzgebel 2013, sec 11 on "wraiths").

[This kid really wants cake; image source]


In-Between Desire and "Weakness of Will"

Much of my work on belief has emphasized the existence of "in-between believing": cases in which people have some substantial portion of the dispositional profile constitutive of believing some particular proposition but in which they also deviate substantially from that profile, such that it's neither quite right to say they believe nor quite right to say they fail to believe. One plausible case is implicit bias: someone who sincerely affirms (for example) that all the races are intellectually equal, and reasons on that basis in explicit contexts, but who also often acts and reacts as if the races are not all intellectually equal.

Similarly, we can have in-between desires. "Weakness of will" and temptation cases are one plausible category. I'm on a diet. Do I desire to eat the chocolate cake? In a sense, obviously yes. There it is. I can feel myself wanting it. I have an urge to reach out and eat it. Maybe I actually do eat it. At the same time, I'm telling myself "I shouldn't eat that cake". And maybe I do resist. I plan ways to avoid eating the cake -- for example, by turning my eyes away, by telling myself and others that I'm not going to eat it. I say to myself sincerely that I want to refrain from eating it. I'm torn.

We could treat this as two conflicting desires. But dispositionalism gives us a different way of conceptualizing the case. Like someone who has some extraverted dispositions and some introverted dispositions, or someone who has some egalitarian dispositions and some racist dispositions, the cake-tempted-dieter has a mix of dispositions that don't all fall neatly on one side or the other.

We can map this partly (not perfectly) onto the short-term / long-term distinction. If I yield to temptation, probably the short-term dispositions were overall dominant in my profile. As soon as I eat the cake, those dispositions mostly disappear and the long-term dispositions dominate, leaving me with the taste of both cake and regret.

Desire, Valuing, and Believing Good: Overlapping Profiles, not Discrete Representations

So far I have only talked about desiring, not valuing, but I don't think they are different in kind. "Valuing" sounds more long-term and tightly connected to intellectual endorsement. (It's odd to say that I "value" eating the cake.) But the relationship between desiring and valuing is something like the relationship between being brave and being courageous. It's not like people have one brave state or one brave representation and then a separate courageous state or courageous representation. Rather, the dispositional profiles constitutive of bravery and courage largely overlap. "Bravery" tilts perhaps a bit more to the physical and has less of a moral loading than "courage". But central to both dispositional profiles is leaping boldly to the defense of your unjustly attacked friends.

Here's how I express the idea in my 2013 article:

Shortly after moving into one of my residences, I met a nineteen-year-old neighbor. Let's call him Ethan. In our first conversation, it came out (i.) that Ethan had a handsome, expensive new pickup truck, and (ii.) that he unfortunately had to go to community college because he couldn't afford to attend a four-year school. Although I didn't think to ask Ethan whether he thought owning a handsome pickup truck was more important than attending a four-year university, let's suppose that's how he lived his life in general.

Ethan's inward and outward actions and reactions -- perhaps not with perfect consistency -- generally revealed a posture toward the world of valuing his truck over his education, or thinking that it's more important to have a beautiful truck than to go to a demanding university, or wanting a beautiful truck more than wanting to attend a four-year school. On a dispositional stereotype approach to the attitudes, we can treat the stereotypes associated with these somewhat different attitudes as largely overlapping, though with different centers and peripheries. Believing and desiring and valuing would seem on the surface to be very different attitude types, and are often treated as such -- beliefs are "cognitive", desires "conative", they have different "directions of fit", etc. -- and yet in Ethan’s case, the particular belief, desire, and valuation seem only subtly different.

On Not Counting Up the Number of Desires

How many desires do you have? Exactly 4,628,414? Yes, that's precisely the number!

Just kidding, of course. The question doesn't even make sense. There is no fact of the matter about exactly how many desires you have. Desires aren't discrete countable things. This fact spells trouble for some excessively realist views of desire that require, for example, that every desire must be underwritten by some particular stored representation. In a forthcoming paper, I argue that this issue creates a morass of problems for representationalist accounts of belief, which must either multiply representations implausibly or draw an occult and useless sharp line between "explicit" (stored) and "tacit" (quickly inferrable) beliefs.

Similar problems -- though I won't detail them here -- will arise for any view of desire that grounds desires in countable objects or states. Dispositionalism avoids these problems. There is no countable number of dispositional profiles that you match to a (contextually determined) appropriate degree. To say that someone matches a dispositional profile is like saying that some part of a richly complex figure has a certain approximate shape. There are many ways to characterize the shape of a complex figure, no countable number of shapes to which a complex figure might to some degree conform, and no need for separate storage compartments for each reasonably accurate shape-description. You get an infinite number of dispositions, and an infinite number of finely specified shape profiles, for free, without need to treat each as requiring a distinctly existing, resource-consuming ontological ground.