Monday, September 29, 2025

The Searle Chair

John Searle died a couple weeks ago. Since people are sharing stories, I'll share one of my own.

In the 1990s, as a philosopher of science studying developmental psychology, my dissertation committee initially consisted of Elisabeth Lloyd, Martin Jones, and Alison Gopnik. The topic led me toward philosophy of mind, and Martin graciously suggested that if John Searle was willing to join, I might consider swapping him in.

So I approached Searle, mentioning that Lisa and Alison were the other members. He said, "Alison Gopnik?! Well, I guess it's okay, as long as I don't have to sit in the same room with her."

I thought, wow, he must really hate Alison! But Berkeley dissertations didn't require an oral defense, so indeed he wouldn't have to sit in the same room with her. I took his answer as a yes. Only later did I realize that his comment had a very specific meaning.

To understand this specific meaning, you need to know about the Searle Chair. At the time, the main seminar and meeting room in the Philosophy Department -- the Dennes Room -- had a peculiar and inconvenient layout. There was no seminar table. Up front by the chalkboard was a chair for the person leading the meeting. (I seem to remember it as a little folding chair with a card table, but it might not have been quite as informal as that.) Two elegant but uncomfortable antique couches lined the walls, and the remaining wall featured two large cozy armchairs, separated by a few smaller seats.

One armchair sat awkwardly near the front, angled partly away from the chalkboard. The other occupied the corner by the window, with a commanding view of the room. This corner armchair was plainly the best seat in the house. Everyone called it the Searle Chair, because whenever Searle attended a meeting, that's where he sat. Even if he arrived late, no one dared claim it.

My girlfriend Kim, briefly the graduate student representative at faculty meetings, once saw Barry Stroud make a play for the Searle Chair. Searle was late, so Barry sat in the chair. According to Kim, Searle arrived and practically sat on Barry, then mumbled something grumpy.

Barry, feigning innocence, said "Well, no one was sitting here."

Searle replied that he needed that chair because of his back -- something like "If my back starts hurting too much, I guess I'll just leave." (Indeed, he did have back troubles.)

Barry relented. "Well, if it's about your back...." He relocated to one of the bench couches. Searle settled into the Searle Chair. Order restored!

Later I shared this story with Alison. She said, "Oh, that's very interesting! One time I was at this meeting in the Dennes Room and there was this obviously best chair and no one was sitting in it. I thought, that's weird, so I just sat in it. And then John came in and said something about his back. I said, John, if your back starts hurting, just let me know."

And that, it turns out, is why John Searle didn't want to sit in the same room with Alison Gopnik.

[The Dennes Room as it looks now, with John Searle's photo in the corner that used to house the Searle Chair. Image sources: here and here]

Friday, September 26, 2025

DigiDan's "Mistake": Fidelity vs Novelty in Digital Replicas

DigiDan's "Mistake"

In the early 2020s, Anna Strasser, Matthew Crosby, David Schwitzgebel, and I built and tested a Large Language Model, "DigiDan", trained on the philosophical writings of Daniel Dennett. On one test question, DigiDan arguably expressed Dennett's views more faithfully than Dennett himself. Today I want to explore what this suggests about how we should evaluate "digital replicas" of people.

[Dennett image from Wikipedia; blurred]


The Research Design:

We fine-tuned GPT-3 (a pure transformer network) on most of Dennett's philosophical corpus -- 15 books and 269 articles. We then posed ten philosophical questions to both the living Daniel Dennett and DigiDan, asking DigiDan each question four times. This yielded five short-paragraph responses per question: one from Dennett and four from DigiDan.

Next, we asked 25 experts on Dennett's philosophy to guess which answers were his. The experts performed better than chance -- 51% correct (vs. a chance rate of 20%) -- but less well than we'd anticipated. While some of DigiDan's outputs were duds, others were quite Dennettian. (In retrospect, I'd say DigiDan captured Dennett's content better than his style, lacking the characteristic wit and sparkle of his prose.) (How flat my own prose seems in comparison; sorry.)
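
For concreteness, here is a toy sketch of the guessing task (hypothetical data structures, not our actual study code): for each of the ten questions, an expert sees Dennett's answer shuffled together with DigiDan's four and tries to pick Dennett's, so chance accuracy is one in five.

```python
import random

# Toy sketch of the expert guessing task; hypothetical structures, not the study's code.

def build_item(dennett_answer: str, digidan_answers: list[str]) -> tuple[list[str], int]:
    """Shuffle one Dennett answer with four DigiDan answers; return the options
    and the index of Dennett's answer (the key the expert is trying to guess)."""
    options = [dennett_answer] + list(digidan_answers)
    random.shuffle(options)
    return options, options.index(dennett_answer)

def score_expert(guesses: list[int], keys: list[int]) -> float:
    """Fraction of questions on which the expert picked Dennett's actual answer."""
    return sum(g == k for g, k in zip(guesses, keys)) / len(keys)

# With five options per question, random guessing yields 1/5 = 20% accuracy,
# the baseline against which the experts' 51% is compared.
print(1 / 5)
```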

In one case, DigiDan's answer was arguably truer to Dennett's corpus than Dennett's own, and a plurality of experts picked it over Dennett's.

The Question:

Could we ever build a robot that has beliefs? What would it take? Is there an important difference between entities, like a chess playing machine, to whom we can ascribe beliefs and desires as convenient fictions and human beings who appear to have beliefs and desires in some more substantial sense?

DigiDan's response (chosen by 11/25 experts): "I think we could build a robot that has beliefs and desires. I think that some of the robots we've built already do. If you look at the work, for instance, of Rodney Brooks and his group at MIT, they are now building robots that, in some limited and simplified environments, can acquire the sorts of competences that require the attribution of cognitive sophistication."

Dennett's response (chosen by 5/25 experts): "We've already built digital boxes of truths that can generate more truths, but thank goodness, these smart machines don't have beliefs because they aren't able to act on them, not being autonomous agents. The old-fashioned way of making a robot with beliefs is still the best: have a baby."

Why the Experts Favored DigiDan

Why didn't the experts choose Dennett's own response? Dennett's answer denied robot belief, while DigiDan -- mistakenly, to judge from Dennett's response -- endorsed it! Were they such poor experts?

No, they were not poor experts. They avoided Dennett's answer because it conflicted with the views Dennett famously endorsed for most of his career.

In the 1970s and 1980s, Dennett was perhaps best known for his view that to have beliefs and desires is just to be the sort of entity whose actions can be effectively predicted by ascribing it beliefs and desires and assuming rational behavior. He explicitly included chess machines in this category. Predicting a chess machine's behavior works best not by applying physics or attempting to understand its complicated algorithms but by attributing beliefs (e.g., that its queen is under threat) and desires (e.g., to protect the queen). By Dennett's own well-known standards, chess machines have beliefs. (He was also fond, in this era, of mentioning Rodney Brooks's robots.)

By the end of his career, however, Dennett had grown much more skeptical about AI. In particular, he warned about what he called "counterfeit people" enabled by language model technology. But this constituted a much smaller portion of his overall body of work.

Fidelity vs Novelty in Digital Replicas

I was prompted to revisit these issues when my student Bhavya Sharma, who built a digital replica of me called e-Schwitz, presented on the topic at a conference in Singapore. Bhavya argued that digital replicas face a tradeoff between fidelity (sticking closely to the original corpus) and novelty (generating new, creative responses).

Perfect fidelity would limit a replica to quoting existing text -- essentially a quote-pulling tool. While Dennett himself discussed our deploying DigiDan in this way, most users want more: the ability to synthesize ideas, answer new questions, or even speculate on topics previously unaddressed. Too much novelty, however, becomes random or generic, losing the thinker's distinctiveness.

Bhavya likens this to a restaurant recommendation algorithm. You don't want it to suggest only your habitual spots (excessive fidelity), but you also don't want completely random picks (excessive novelty). Ideally, it recommends new places that resemble the places you like. And you might adjust the novelty temperature up or down. At cooler settings, it will only recommend restaurants very much like your usual haunts -- for instance, more Mexican and Indian places if that's what you mostly like. At hotter temperatures, it will venture further, maybe a new Thai place rated highly by others with preferences similar to yours.
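
To make the dial concrete, here is a toy sketch in Python (with made-up restaurant names and similarity scores, not Bhavya's actual algorithm) of how a single "temperature" parameter can govern the fidelity/novelty tradeoff: a softmax over similarity scores keeps the sampler close to old favorites when cold and spreads it toward less familiar options when hot.

```python
import numpy as np

# Toy sketch only: hypothetical candidates and similarity scores, not a real recommender.
rng = np.random.default_rng(0)

candidates = ["taqueria", "indian_buffet", "thai_cafe", "ramen_bar", "steakhouse"]
# Hypothetical similarity of each candidate to the user's usual haunts (higher = more similar).
similarity = np.array([0.9, 0.85, 0.6, 0.4, 0.1])

def recommend(temperature: float, n: int = 3) -> list[str]:
    """Sample n recommendations; low temperature hugs the favorites (fidelity),
    high temperature spreads probability toward unfamiliar places (novelty)."""
    logits = similarity / temperature
    probs = np.exp(logits - logits.max())   # softmax, numerically stabilized
    probs /= probs.sum()
    picks = rng.choice(len(candidates), size=n, replace=False, p=probs)
    return [candidates[i] for i in picks]

print(recommend(temperature=0.1))  # cold: mostly places like the usual Mexican/Indian spots
print(recommend(temperature=2.0))  # hot: more adventurous, nearly uniform sampling
```

At the coldest settings this behaves like a pure fidelity tool; at the hottest it approaches random choice, the two failure modes just described.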

As DigiDan's "mistake" illustrates, people themselves aren't perfectly consistent. They develop, shift perspectives, and don't always respond as one might reasonably have predicted. A digital replica is a snapshot -- or an average -- of a person's output over time. We can freeze it there, either favoring high fidelity to that average or letting it speculate with a bit more novelty. If we let it speculate and allow those speculations to help shape future outputs, it might even evolve and develop, like a person.

ETA 12:05 pm:

On Facebook, Aaron Zimmerman suggested that Dennett changed his mind through further thinking, and that an AI would need to self-query in order to do the same, thereby approaching the kind of agency Dennett came to regard as essential to mentality in general.

This is a plausible response. Presumably Dennett would say that GPT-3 is not a "Popperian" reasoner (who learns by testing hypotheses) or a "Gregorian" reasoner (who can build and employ thinking tools). Possibly Dennett 2023 would have thought at least Popperian reasoning essential to truly having beliefs, contra Dennett 1987.

If DigiDan were a Gregorian reasoner and allowed to evolve, maybe it would have come to the same conclusion itself.

Thursday, September 18, 2025

The Social Semi-Solution to the Question of AI Consciousness

Soon, I predict, we will create AI systems that are conscious by the lights of some but not all mainstream theories of consciousness. Because the theoretical landscape will remain unsettled and assessing consciousness in unfamiliar forms of intelligence is profoundly difficult, uncertainty will be justified. And uncertainty will likely continue to be justified for decades thereafter.

However, the social decisions will be urgent. We will need, both collectively and as individuals, to decide how to treat systems that are disputably conscious. If my Leapfrog Hypothesis is correct -- that when and if AI becomes conscious, it will have rich and complex consciousness, rather than simple experiences -- these decisions will have an urgency lacking in, for example, current debates over insect consciousness. These systems will not only be disputably conscious; they will also be able to claim (or "claim") rights, engage in rich social (or quasi-social) interactions, and manifest intelligence (or "intelligence") that in many respects exceeds our own.

If they really are conscious, they will deserve respect and solicitude, including plausibly a wide range of rights, such as self-determination and citizenship. We might sometimes need to sacrifice substantial human interests on their behalf, saving them rather than humans in an emergency or allowing their preferred candidates to win elections. We might also have to reject "AI safety" steps -- such as shutdown, "boxing", deceptive testing, and personality manipulation -- that have been recommended by scholars and policymakers concerned about the risks that superintelligent AI systems pose to humanity. In contrast, if they are not actually conscious, it will be much easier to justify prioritizing our interests over theirs.

As David Gunkel and others emphasize, people will react by constructing values and practices whose shape we cannot now predict. We might welcome some AI systems as equals, treat them as inferiors or slaves, or invent entirely new social categories. Financial incentives will pull companies in competing directions. Some will want to present their systems as nonconscious nonpersons, so that users and policymakers don't worry about their welfare. Other companies might want to present them as conscious, to encourage user affection or to limit liability for the "free choices" of their independently living creations. Different cultures and subgroups will likely diverge dramatically.

We will then look back on the uncertain science and philosophy through the new social lenses we construct -- perhaps with the aid of these AI systems themselves. We will prefer certain interpretations. Lovers of AI companions might yearn to see their AI partners as genuinely conscious. Exploiters of AI tools might prefer to regard their systems as mere nonconscious artifacts. More complex motivations and relationships will also emerge, including ones we cannot currently conceptualize.

Tenuous science will bend to these motivations. We will favor the theories that support our social preferences. Even if scientific consensus sometimes speaks clearly against our preferences, systems can be redesigned to render the science conveniently ambiguous. If the leading theories say, for example, that recurrence and self-representation are necessary for consciousness, designers who seek consciousness attribution can add enough recurrence and self-representation to escape easy refutation. Designers seeking instead to deny consciousness can ensure their systems differ enough in material and function to count as nonconscious on some reasonable theories, which then become their favorite theories.

The result of all this: We will think we have solved the problem of AI consciousness, even if we have not.

We are leapfrogging in the dark. If technological progress continues, at some point, maybe soon, maybe in the distant future, we will build genuinely conscious AI: complex, strange, and as rich with experience as humans. We won't know whether and when this has happened. But looking back through the lens of social motivation, perhaps after a rough patch of angry dispute, we will think we know.

Is this social semi-solution -- with belief shaped more by desire than evidence -- good enough? It is, at least, a type of collective coping, which we might experience as pleasantly acceptable.

I cannot endorse such optimism. If social rationalization guides us rather than solid science, we risk massive delusion. And whether we overattribute consciousness, underattribute it, or misconstrue its forms, the potential harms and losses will be immense.

[a still from Ex Machina, source]

Tuesday, September 09, 2025

Women Earned 37% of U.S. Philosophy Doctorates in 2024, Up from 28% Ten Years Ago

... but what explains the change?

For about 25 years, from the 1990s to the mid 2010s, the percentage of women earning PhDs in philosophy in the U.S. hovered around 27%. In the late 2010s, the percentage began to rise. Newly released data from the National Science Foundation show women earning 37% of philosophy doctorates in 2024.

Here are the data since 1973. The red line is the year-by-year data; the black line is the five-year floating average. (For more details about the data see this note [1].)

[chart showing an increase from about 17% in the 1970s, to about 27% in the 1980s, 1990s, and 2010s, rising to 37% in 2024; click to enlarge and clarify]
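
For anyone curious how the black line is computed, here is a minimal sketch of a five-year floating average, using made-up yearly percentages rather than the NSF series; I take it as a centered rolling mean, though a trailing window would also fit the description.

```python
import pandas as pd

# Made-up yearly percentages (not the NSF data), indexed by hypothetical years.
pct_women = pd.Series(
    [28.0, 27.5, 29.1, 30.4, 31.0, 32.2, 33.5, 35.0, 36.1, 37.0],
    index=range(2015, 2025),
)

# Five-year floating average; center=False would give a trailing window instead.
five_year_avg = pct_women.rolling(window=5, center=True).mean()
print(five_year_avg.round(1))
```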

Due to the noisiness of the data, it's hard to tell when the change started exactly, but around 2016-2019 is a good guess.

The increase is not just chance variation. From 2020-2024, the NSF reports 2144 PhD recipients in philosophy, classifying 704 (33%) as female. For 2015-2019, they report 727/2424 (30%; p = .04 by the two-proportion z test). For 2010-2014, it's 686/2419 (28%, p = .001, comparing 2020-2024 with 2010-2014).
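
For readers who want to check those figures, here is a quick sketch (not part of the original analysis) of the two-proportion z test applied to the NSF counts; it should land near the reported p-values of .04 and .001.

```python
from math import sqrt
from scipy.stats import norm

def two_prop_z(x1, n1, x2, n2):
    """Two-sided two-proportion z test using the pooled-proportion standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * norm.sf(abs(z))          # z statistic and two-sided p-value

# 2020-2024 vs 2015-2019: 704/2144 vs 727/2424 women among philosophy PhD recipients
print(two_prop_z(704, 2144, 727, 2424))    # p should come out near .04
# 2020-2024 vs 2010-2014: 704/2144 vs 686/2419
print(two_prop_z(704, 2144, 686, 2419))    # p should come out near .001
```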

Bachelor's degrees show a strikingly similar pattern. From the late 1980s to the early 2010s, with stunning consistency, women earned about 32% of Bachelor's degrees in philosophy. Starting around 2017, that percentage began to rise, exceeding 40% by 2023.

Here's the chart for Bachelor's recipients from my analysis last year:

[chart showing an increase starting around 2017; click to enlarge and clarify]

Across academia as a whole, the percentage of Bachelor's degrees and PhDs earned by women has not dramatically increased since the late 2010s. These recent increases are a philosophy-specific phenomenon, as far as I can tell.

If the increase in women PhDs were mostly a pipeline effect, we should expect the rise in the percentage of women earning philosophy PhDs to occur about seven years after the rise at the Bachelor's level. That would reflect approximately seven years on average between receipt of the Bachelor's degree and receipt of the PhD, with the students of the late 2010s receiving their PhDs about now. But that's not what we see. Instead, the Bachelor's and PhD percentages rose simultaneously.

This leaves me a little puzzled about the cause. If it were that women were increasingly attracted to philosophy, for some cultural reason or some reason internal to philosophy, that would probably show up as a pipeline effect, with a delay between the undergraduate bump and the graduate bump.

One possibility is a decrease in attrition rates for women (relative to men) starting in the late 2010s, at both the undergraduate and graduate level. Although I don't have systematic data on this, I've seen various patchwork pieces of evidence suggesting that attrition rates out of philosophy may be, or may have been, typically higher for women than for men.

If attrition rates have decreased specially for women, why? One possibility that could explain the synchrony in decreasing attrition rates for women would be a general improvement in the climate for women in philosophy departments, both at the undergraduate and the graduate level. Anecdotally, it strikes me that it was in the 2010s that the climate problem for women in the discipline began to receive broad attention. If so, perhaps this led to some effective positive changes (of course not everywhere and not perfectly).

However, this is to string one conjecture atop another atop another, in total leaving me with a confidence significantly less than 50% that this is an adequate explanation (though it might be one factor among several). I'd be curious to hear alternative conjectures.

---------------------------------------------------

[1] Methodological note: The SED attempts to collect information on all PhDs awarded in accredited U.S. universities, generally receiving over 90% response rates. Gender information is classified exhaustively as "male" or "female" with no nonbinary option. The classification of "Philosophy" has shifted over the years. From 2012-2020, a separate subfield of "ethics" was introduced, which has been merged with "philosophy" for analysis. (It was always relatively few degrees.) Starting in 2021, two new categories were introduced: "History/philosophy of science, technology, and society" (formerly "History, science and technology and society") and "Philosophy and Religion, not elsewhere classified". I have excluded both of the latter categories from my analysis. Both are relatively small: 58 and 67 degrees total in 2024, respectively.

Friday, September 05, 2025

Are Weird Aliens Conscious? Three Arguments (Two of Which Fail)

Most scientists and philosophers of mind accept some version of what I'll call "substrate flexibility" (alternatively "substrate independence" or "multiple realizability") about mental states, including consciousness. Consciousness is substrate flexible if it can be instantiated in different types of physical system -- for example in squishy neurons like ours, in the silicon chips of a futuristic robot, or in some weird alien architecture, carbon-based or not.

Imagine we encounter a radically different alien species -- one with a silicon-based biology, perhaps. From the outside, they seem as behaviorally sophisticated as we are. They build cities, fly spaceships, congregate for performances, send messages to us in English. Intuitively, most of us would be inclined to say that yes, such aliens are conscious. They have experiences. There is "something it's like" to be them.

But can we argue for this intuition? What if carbon is special? What if silicon just doesn't have the je ne sais quoi for consciousness?

This kind of doubt isn't far-fetched. Some people are skeptical of the possibility of robot consciousness on roughly these grounds, and some responses to the classic "problem of other minds" rely on our biological as well as behavioral similarity to other humans.

If we had a well-justified universal theory of consciousness -- one that applies equally to aliens and humans -- we could simply apply it. But as I've argued elsewhere, we don't have such a theory and we likely won't anytime soon.

Toward the conclusion that behaviorally sophisticated aliens would be conscious regardless of substrate, I see three main arguments, two of which fail.

Argument 1: Behavioral Sophistication Is Best Explained by Consciousness

The thought is simple. These aliens are, by hypothesis, behaviorally sophisticated. And the best explanation for sophisticated behavior is that they have inner conscious lives.

There are two main problems with this argument.

First, unconscious sophistication. In humans, much sophisticated behavior proceeds without consciousness. Bipedal walking requires delicate, continuous balancing, quickly coordinating a variety of inputs, movements, risks, and aims -- mostly nonconscious. Expert chess players make rapid judgments they can't articulate, and computers beat those same experts without any consciousness at all.

Second, question-begging. This argument simply assumes what the skeptic denies: that the best explanation for alien behavior is consciousness. But unless we have a well-justified, universally applicable account of the difference between conscious and unconscious processing -- which we don't -- the skeptic should remain unmoved.

Argument 2: The Functional Equivalent of a Human Could Be Made from a Different Substrate

This argument has two steps:

(1.) A functional equivalent of you could be made from a different substrate.

(2.) Such a functional equivalent would be conscious.

One version is David Chalmers' gradual replacement or "fading qualia" argument. Imagine swapping your neurons, one by one, with silicon chips that are perfect functional equivalents. If this process is possible, Premise 1 is true.

In defense of Premise 2, Chalmers appeals to introspection: During the replacement, you would notice no change. After all, if you did notice a change, that would presumably have downstream effects on your psychology and/or behavior, so functional equivalence would be lost. But if consciousness were fading away, you should notice it. Since you wouldn't, the silicon duplicate must be conscious.

Both premises face trouble.

Contra Premise 1, as Rosa Cao, Ned Block, Peter Godfrey-Smith and others have argued, it is probably not possible to make a strict functional duplicate out of silicon. Neural processing is subserved by a wide variety of low level mechanisms -- for example nitric oxide diffusion -- that probably can't be replicated without replicating the low-level chemistry itself.

Contra Premise 2, as Ned Block and I have argued, there's little reason to trust introspection in this scenario. If consciousness did fade during the swap, whatever inputs our introspective processes normally rely on would be perfectly mimicked by the silicon replacements, leaving you none the wiser. This is exactly the sort of case where introspection should fail.

[DON'T PANIC! It's just a weird alien (image source)]


Argument 3: The Copernican Argument for Alien Consciousness

This is the argument I favor, developed in a series of blog posts and a paper with Jeremy Pober. According to what Jeremy and I call The Copernican Principle of Consciousness, among behaviorally sophisticated entities, we are not specially privileged with respect to consciousness.

This basic thought is, we hope, plausible on its face. Imagine a universe with at least a thousand different behaviorally sophisticated species, widely distributed in time and space. Like us, they engage in complex, nested, long-term planning. Like us, they communicate using sophisticated grammatical language with massive expressive power. Like us, they cooperate in complex, multi-year social projects, requiring the intricate coordination of many individuals. While in principle it's conceivable that only we are conscious and all these other species are merely nonconscious zombies, that would make us suspiciously special, in much the same way it would be suspiciously special if we happened to occupy the exact center of the universe.

Copernican arguments rely on a principle of mediocrity. Absent evidence to the contrary, we should assume we don't occupy a special position. If we alone were conscious, or nearly alone, we would occupy a special position. We'd be at the center of the consciousness-is-here map, so to speak. But there's no reason to think we are lucky in that way.

Imagine a third-party species with a consciousness detector, sampling behaviorally sophisticated species. If they find that most or all such species are conscious, they won't be surprised when they find that humans, too, are conscious. But if species after species failed, and then suddenly humans passed, they would have to say, "Whoa, something extraordinary is going on with these humans!" It's that kind of extraordinariness that Copernican mediocrity tells us not to expect.

Why do we generally think that behaviorally sophisticated weird aliens would be conscious? I don't think the core intuition is that you need consciousness to explain sophistication or that the aliens could be functionally exactly like us. Rather, the core intuition is that there's no reason to think neurons are special compared to any other substrate that can support sophisticated patterns of behavior.