Friday, January 30, 2026

Does Global Workspace Theory Solve the Question of AI Consciousness?

Hint: no.

Below are three sections from Chapter Eight of my manuscript in draft, AI and Consciousness; a new version is available here as of today. Comments welcome!

[image adapted from Dehaene et al. 2011]


1. Global Workspace Theories and Access.

The core idea of Global Workspace Theory is simple. Sophisticated cognitive systems like the human mind employ specialized processes that operate to a substantial extent in isolation. We can call these modules, without committing to any strict interpretation of that term.[1] For example, when you hear speech in a familiar language, some cognitive process converts the incoming auditory stimulus into recognizable speech. When you type on a keyboard, motor functions convert your intention to type a word like “consciousness” into nerve signals that guide your fingers. When you try to recall ancient Chinese philosophers, some cognitive process pulls that information from memory without (amazingly) clogging your consciousness with irrelevant information about German philosophers, British prime ministers, rock bands, or dog breeds.

Of course, not all processes are isolated. Some information is widely shared, influencing or available to influence many other processes. Once I recall the name “Zhuangzi”, the thought “Zhuangzi was an ancient Chinese philosopher” cascades downstream. I might say it aloud, type it out, use it as a premise in an inference, form a visual image of Zhuangzi, contemplate his main ideas, attempt to sear it into memory for an exam, or use it as a clue to decipher a handwritten note. To say that some information is in “the global workspace” just is to say that it is available to influence a wide range of cognitive processes. According to Global Workspace Theory, a representation, thought, or cognitive process is conscious if and only if it is in the global workspace – if it is “widely broadcast to other processors in the brain”, allowing integration both in the moment and over time.[2]

Recall the ten possibly essential features of consciousness from Chapter Three: luminosity, subjectivity, unity, access, intentionality, flexible integration, determinacy, wonderfulness, specious presence, and privacy. [Blog readers: You won't have read Chapter Three, but try to ride with it anyway.] Global Workspace Theory treats access as the central essential feature.

Global Workspace Theory can potentially explain other possibly essential features. Luminosity follows if processes or representations in the workspace are available for introspective processes of self-report. Unity might follow if there’s only one workspace, so that everything in it is present together. Determinacy might follow if there’s a bright line between being in the workspace and not being in it. Flexible integration might follow if the workspace functions to flexibly combine representations or processes from across the mind. Privacy follows if only you can have direct access to the contents of your workspace. Specious presence might follow if representations or processes generally occupy the workspace for some hundreds of milliseconds.

In ordinary adult humans, typical examples of conscious experience – your visual experience of this text, your emotional experience of fear in a dangerous situation, your silent inner speech, your conscious visual imagery, your felt pains – appear to have the broad cognitive influences Global Workspace Theory describes. It’s not as though we commonly experience pain but find that we can’t report it or act on its basis, or that we experience a visual image of a giraffe but can’t engage in further thinking about the content of that image. Such general facts, plus the theory’s potential to explain features such as luminosity, unity, determinacy, flexible integration, privacy, and specious presence, lend Global Workspace Theories substantial initial attractiveness.

I have treated Global Workspace Theory as if it were a single theory, but it encompasses a family of theories that differ in detail, including “broadcast” and “fame” theories – any theory that treats the broad accessibility of a representation, thought, or process as the central essential feature making it conscious.[3]

Consider two contrasting views: Stanislas Dehaene’s Global Neuronal Workspace Theory and Daniel Dennett’s “fame in the brain” view. Dehaene holds that entry into the workspace is all-or-nothing. Once a process “ignites” into the workspace, it does so completely. Every representation or process either stops short of entering consciousness or is broadcast to all available downstream processes. Dennett’s fame view, in contrast, admits degrees. Representations or processes might be more or less famous, available to influence some downstream cognitive processes without being available to influence others. There is no one workspace, but a pandemonium of competing processes.[4] If Dennett is correct, luminosity, determinacy, unity, and flexible integration all potentially come under threat in a way they do not as obviously come under threat on Dehaene’s view.[5]
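To make the structural contrast vivid, here is a deliberately toy sketch in Python (my own illustration, with made-up numbers, not anything drawn from Dehaene or Dennett). On the first policy, a representation either reaches every consumer process or none; on the second, it can be famous for some consumers and unknown to others.

```python
# Toy contrast between two broadcast policies; illustrative only, with invented numbers.

IGNITION_THRESHOLD = 0.7  # hypothetical salience threshold for all-or-nothing "ignition"

def ignition_broadcast(content, salience, consumers):
    """Dehaene-style policy: above threshold the content reaches every
    consumer process; below threshold it reaches none."""
    if salience >= IGNITION_THRESHOLD:
        return {name: content for name in consumers}
    return {}

def fame_broadcast(content, influence_by_consumer, cutoff=0.5):
    """Dennett-style graded policy: the content reaches some consumers and
    not others, with no single workspace and no sharp overall threshold."""
    return {name: content
            for name, influence in influence_by_consumer.items()
            if influence > cutoff}

consumers = ["verbal_report", "memory", "planning", "grip_control"]
print(ignition_broadcast("X looks smaller than Y", 0.8, consumers))
print(fame_broadcast("X looks smaller than Y",
                     {"verbal_report": 0.9, "memory": 0.6,
                      "planning": 0.4, "grip_control": 0.1}))
```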

Dennettian concerns notwithstanding, all-or-nothing ignition into a single, unified workspace is currently the dominant version of Global Workspace Theory. The issue remains unsettled and has obvious implications for the types of architectures that might plausibly host AI consciousness.

2. Consciousness Outside the Workspace; Nonconsciousness Within It?

Global Workspace Theory is not the correct theory of consciousness unless all and only thoughts, representations, or processes in the Global Workspace are conscious. Otherwise, something else, or something additional, is necessary for consciousness.

It is not clear that even in ordinary adult humans a process must be in the Global Workspace to be conscious. Consider the case of peripheral experience. Some theorists maintain that people have rich sensory experiences outside of focal attention: a constant background experience of your feet in your shoes and objects in the visual periphery.[6] Others – including Global Workspace theorists – dispute this. Introspective reports vary, and resolving such issues is methodologically tricky.

One methodological problem: People who report constant peripheral experiences might mistakenly assume that such experiences are always present because they are always present whenever they think to check, and the very act of checking might generate those experiences. This is sometimes called the “refrigerator light illusion”, akin to the error of thinking the refrigerator light is always on because it’s always on when you open the door to check.[7] On this view, you’re only tempted to think you have constant tactile experience of your feet in your shoes because you have that experience on those rare occasions when you’re thinking about whether you have it. Even if you now seem to have a broad range of experiences in different sensory modalities simultaneously, this could result from an unusual act of dispersed attention, or from “gist” perception or “ensemble” perception, in which you are conscious of the general gist or general features of a scene, knowing that there are details, without actually experiencing those unattended details.[8]

The opposite mistake is also possible. Those who deny a constant stream of peripheral experiences might simply be failing to notice or remember them. The fact that you don’t remember now the sensation of your feet in your shoes two minutes ago hardly establishes that you lacked the sensation at the time. Although many people find it introspectively compelling that their experience is rich with detail or that it is not, the issue is methodologically complex because introspection and memory are not independent of the phenomena to be observed.[9]

If we do have rich sensory experience outside of attention, it is unlikely that all of that experience is present in or broadcast to a Global Workspace. Unattended peripheral information is rarely remembered or consciously acted upon, tending to exert limited downstream influence – the paradigm of information that is not widely broadcast. Moreover, the Global Workspace is generally characterized as limited capacity, containing only a few thoughts, representations, objects, or processes at a time – those that survive some competition or attentional selection – not a welter of richly detailed experiences in many modalities at once.[10]

A less common but equally important objection runs in the opposite direction: Perhaps not everything in the Global Workspace is conscious. Some thoughts, representations, or processes might be widely broadcast, shaping diverse processes, without ever reaching explicit awareness.[11] Implicit racist assumptions, for example, might influence your mood, actions, facial expressions, and verbal expressions. The goal of impressing your colleagues during a talk might have pervasive downstream effects without occupying your conscious experience moment to moment.

The Global Workspace theorist who wants to allow that such processes are not conscious might suggest that, at least for adult humans, processes in the workspace are generally also available for introspection. But there’s substantial empirical risk in this move. If the correlation between introspective access and availability for other types of downstream cognition isn’t excellent, the Global Workspace theorist faces a dilemma. Either allow many conscious but nonintrospectable processes, violating widespread assumptions about luminosity, or redefine the workspace in terms of introspectability, which amounts to shifting to a Higher Order view.

3. Generalizing Beyond Vertebrates.

The empirical questions are difficult even in ordinary adult humans. But our topic isn’t ordinary adult humans – it’s AI systems. For Global Workspace Theory to deliver the right answers about AI consciousness, it must be a universal theory applicable everywhere, not just a theory of how consciousness works in adult humans, vertebrates, or even all animals.

If there were a sound conceptual argument for Global Workspace Theory, then we could know the theory to be universally true of all conscious entities. Empirical evidence would be unnecessary. It would be as inevitably true as that rectangles have four sides. But as I argued in Chapter Four, conceptual arguments for the essentiality of any of the ten possibly essential features are unlikely to succeed – and a conceptual argument for Global Workspace Theory would be tantamount to a conceptual argument for the essentiality of access, one of those ten features. Not only do the general observations of Chapter Four tell against a conceptual guarantee; so too does the apparent conceivability, as described in Section 2 above, of consciousness outside the workspace or nonconsciousness within it – even if such claims are empirically false.

If Global Workspace Theory is the correct universal theory of consciousness applying to all possible entities, an empirical argument must establish that fact. But it’s hard to see how such an empirical argument could proceed. We face another version of the Problem of the Narrow Evidence Base. Even if we establish that in ordinary humans, or even in all vertebrates, a thought, representation, or process is conscious if and only if it occupies a Global Workspace, what besides a conceptual argument would justify treating this as a universal truth that holds among all possible conscious systems?

Consider some alternative architectures. The cognitive processes and neural systems of octopuses, for example, are distributed across their bodies, often operating substantially independently rather than reliably converging into a shared center.[12] AI systems certainly can be, indeed often are, similarly decentralized. Imagine coupling such disunity with the capacity for self-report – an animal or AI system with processes that are reportable but poorly integrated with other processes. If we assume Global Workspace Theory at the outset, we can conclude that only sufficiently integrated processes are conscious. But if we don’t assume Global Workspace Theory at the outset, it’s difficult to imagine what near-future evidence could establish that fact beyond a reasonable standard of doubt to a researcher who is initially drawn to a different theory.

If the simplest version of Global Workspace Theory is correct, we can easily create a conscious machine. This is what Dehaene and collaborators envision in the 2017 paper I discussed in Chapter One. Simply create a machine – such as an autonomous vehicle – with several input modules, several output modules, a memory store, and a central hub for access and integration across the modules. Consciousness follows. If this seems doubtful to you, then you cannot straightforwardly accept the simplest version of Global Workspace Theory.[13]
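For concreteness, here is a minimal sketch, in Python, of the kind of architecture that paragraph describes: input and output modules, a memory store, and a central hub that selects one item by a salience competition and broadcasts it everywhere. The class names, the capacity of one, and the competition rule are my own hypothetical simplifications, not details taken from Dehaene and collaborators.

```python
# A minimal, hypothetical sketch of a global-workspace-style control loop.
# Design choices (capacity of one, max-salience competition) are illustrative only.

class Module:
    def __init__(self, name):
        self.name = name
        self.received = []        # contents broadcast to this module so far

    def receive(self, content):
        self.received.append(content)

class GlobalWorkspace:
    """Central hub: input modules post candidate contents, a simple salience
    competition selects one winner, and the winner is broadcast to all
    modules and written to a memory store."""
    def __init__(self, modules):
        self.modules = modules
        self.memory = []

    def cycle(self, candidates):
        # candidates: list of (salience, content) pairs posted this cycle
        if not candidates:
            return None
        salience, winner = max(candidates)     # limited capacity: one item wins
        for module in self.modules:            # broadcast to every module
            module.receive(winner)
        self.memory.append(winner)             # integration over time
        return winner

modules = [Module(n) for n in ("vision", "audition", "route_planning",
                               "steering", "verbal_report")]
workspace = GlobalWorkspace(modules)
workspace.cycle([(0.9, "pedestrian ahead"), (0.4, "radio chatter")])
print(modules[3].received)   # ['pedestrian ahead'] -- available to steering, report, memory
```

Whether any such system, however elaborated, would thereby be conscious is, of course, exactly what is at issue.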

We can apply Global Workspace Theory to settle the question of AI consciousness only if we know the theory to be true either on conceptual grounds or because it is empirically well established as the correct universal theory of consciousness applicable to all types of entity. Despite the substantial appeal of Global Workspace Theory, we cannot know it to be true by either route.

-------------------------------------

[1] Full Fodorian (1983) modularity is not required.

[2] Mashour et al. 2020, pp. 776-777.

[3] E.g., Baars 1988; Dennett 1991, 2005; Tye 2000; Prinz 2012; Dehaene 2014; Mashour et al. 2020.

[4] Whether Dennett’s view is more plausible than Dehaene’s turns on whether, or how commonly, representations or processes are partly famous. Some visual illusions, for example, seem to affect verbal report but not grip aperture: We say that X looks smaller than Y, but when we reach toward X and Y we open our fingers to the same extent, accurately reflecting that X and Y are the same size. The fingers sometimes know what the mouth does not (Aglioti et al. 1995; Smeets et al. 2020). We adjust our posture while walking and standing in response to many sources of information that are not fully reportable, suggesting wide integration but not full accessibility (Peterka 2018; Shanbhag 2023). Swift, skillful activity in sports, in handling tools, and in understanding jokes also appears to require integrating diverse sources of information that might not be fully accessible or reportable (Christensen et al. 2019; Vauclin et al. 2023; Horgan and Potrč 2010). In response, the all-or-nothing “ignition” view can treat such cases of seeming intermediacy or disunity as atypical (it needn’t commit to 100% exceptionless ignition with no gray-area cases), allow some nonconscious communication among modules (which needn’t be entirely informationally isolated), and/or allow for erroneous or incomplete introspective report (maybe some conscious experiences are too brief, complex, or subtle for people to confidently report experiencing them).

[5] Despite developing a theory of consciousness, Dennett (2016) endorsed “illusionism”, which rejects the reality of phenomenal consciousness (see especially Frankish 2016). I interpret the dispute between illusionists and nonillusionists as a verbal dispute about whether the specific philosophical concept of “phenomenal consciousness” requires immateriality, irreducibility, perfect introspectability, or some other dubious property, or whether the term can be “innocently” used without invoking such dubious properties. See Schwitzgebel 2016, 2025.

[6] Reviewed in Schwitzgebel 2011, ch. 6; and though limited to stimuli near the center of the visual field, see the large literature on “overflow” in response to Block 2007.

[7] Thomas 1999.

[8] Oliva and Torralba 2006; Whitney and Leib 2018.

[9] Schwitzgebel 2007 explores the methodological challenges in detail.

[10] E.g., Dehaene 2014; Mashour et al. 2020.

[11] E.g., Searle 1983, ch. 5; Bargh and Morsella 2008; Lau 2022; Michel et al. 2025; see also note 4.

[12] Godfrey-Smith 2016; Carls-Diamante 2022.

[13] See also Goldstein and Kirk-Giannini (forthcoming) for an extended application of Global Workspace Theory to AI consciousness. One might alternatively read Dehaene, Lau, and Kouider 2017 purely as a conceptual argument: If all we mean by “conscious” is “accessible in a Global Workspace”, then building a system of this sort suffices for building a conscious entity. The difficulty then arises in moving from that stipulative conceptual claim to the interesting, substantive claim about phenomenal consciousness in the standard sense described in Chapter Two. Similar remarks apply to the Higher Order aspect of that article. One challenge for this deflationary interpretation is that in related works (Dehaene 2014; Lau 2022) the authors treat their accounts as accounts of phenomenal consciousness. The article concludes by emphasizing that in humans “subjective experience coheres with possession” of the functional features they identify. A further complication: Lau later says that the way he expressed his view in this 2017 article was “unsatisfactory” (Lau 2022, p. 168).

Friday, January 23, 2026

Is Signal Strength a Confound in Consciousness Research?

Matthias Michel is among the sharpest critics of the methods of consciousness science. His forthcoming paper, "Consciousness Doesn't Do That", convincingly challenges background assumptions behind recent efforts to discover the causes, correlates, and prevalence of consciousness. It should be required reading for anyone tempted to argue, for example, that trace conditioning correlates with consciousness in humans and thus that nonhuman animals capable of trace conditioning must also be conscious.

But Michel does make one claim that bugs me, and that claim is central to the article. And Hakwan Lau -- another otherwise terrific methodologist -- makes a similar claim in his 2022 book In Consciousness We Trust, and again the claim is central to the argument of that book. So today I'm going to poke at that claim, and maybe it will burst like a sour blueberry.

The claim: Signal strength (performance capacity, in Lau's version) is a confound in consciousness research.

As Michel uses the phrase, "signal strength" is how discriminable a perceptible feature is to a subject. A sudden, loud blast of noise has high signal strength. It's very easy to notice. A faint wavy pattern in a gray field, presented for a tenth of a second, has low signal strength. It is easy to miss. Importantly, signal strength is not the same as (objective, externally measurable) stimulus intensity, but reflects how well the perceiver responds to the signal.
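For readers who want the usual operationalization: in the signal detection framework that figures prominently in this literature, discriminability for a subject is standardly quantified as d', computed from that subject's hit and false alarm rates. A minimal sketch, with invented example rates:

```python
# Sketch of the standard sensitivity index d'; the example rates are invented.
from scipy.stats import norm

def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit rate) - z(false alarm rate): how discriminable a stimulus is
    to this observer, independent of the observer's response bias."""
    return norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)

print(d_prime(0.99, 0.02))  # loud blast: nearly always detected, rarely "heard" when absent (~4.4)
print(d_prime(0.55, 0.45))  # faint, briefly flashed pattern: barely above guessing (~0.25)
```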

Signal strength clearly correlates with consciousness. You're much more likely to be conscious of stimuli that you find easy to discriminate than stimuli that you find difficult to discriminate. The loud blare is consciously experienced. The faint wavy pattern might or might not be. A stimulus with effectively zero signal strength -- say, a gray dot flashed for a millionth of a second and immediately masked -- will normally not be experienced at all.

But signal strength is not the same as consciousness. The two can come apart. The classic example is blindsight. On the standard interpretation (but see Phillips 2020 for an alternative), patients with a specific type of visual cortex damage can discriminate stimuli that they cannot consciously perceive. Flash either an "X" or an "O" in the blind part of their visual field and they will say they have no visual experience of it. But ask them to guess which letter was shown and their performance is well above chance -- up to 90% correct in some tasks. The "X" has some signal strength for them: It's discriminable but not consciously experienced.

If signal strength is not consciousness but often correlates with it, the following worry arises. When a researcher claims that "trace conditioning is only possible for conscious stimuli" or "consciousness facilitates episodic memory", how do you know that it's really consciousness doing the work, rather than signal strength? Maybe stimuli with high signal strength are both more likely to be consciously experienced and more likely to enable trace conditioning and episodic memory. Unless researchers have carefully separated the two, the causal role of consciousness remains unclear.

An understandable methodological response is to try to control for signal strength: Present stimuli that are similarly discriminable to the subject but that differ in whether (or to what extent) they are consciously experienced. Only then, the reasoning goes, can differences in downstream effects be confidently attributed to consciousness itself rather than to differences in signal strength. Lau in particular stresses the importance of such controls. Yet such careful matching is difficult and rarely attempted. On this reasoning, much of the literature on the cognitive role of consciousness is built on sand, not clearly distinguishing the effects of consciousness from the effects of signal strength.

This reasoning is attractive but faces an obvious objection, which both Michel and Lau address directly. What if signal strength just is consciousness? Then trying to "control" for it would erase the phenomenon of interest.

Both Michel and Lau analogize to height and bone length. Suppose you want to test whether height confers an advantage in basketball or dating. If skin color correlates with height, it makes sense to control for differences in skin color by systematically comparing people with the same skin color but different heights. If the advantage persists, you can infer that height rather than skin color is doing the work. But trying to control for bone length lands you in nonsense. Taller people just are the people with longer bones.

Michel and Lau respond by noting that consciousness and signal strength (or performance capacity) sometimes dissociate, as in blindsight. Therefore, they are not the same thing and it does make sense to control for one in exploring the effects of the other.

But this response is too simple and too fast.

We can see this even in their chosen example. Height and bone length aren't quite the same thing. They can dissociate. People are about 1-2 cm taller in the morning than at night -- not because their bones have grown but because the tissue between the bones (especially in the spine) compresses during the day.

Now imagine an argument parallel to Michel's and Lau's: Since height and bone length can come apart, we should try to control for bone length in examining the effects of height on basketball and dating. We then compare the same people's basketball and dating outcomes in the morning and at night, "holding bone length fixed" while height varies slightly. This would be a methodological mistake. For one thing, we've introduced a new potential confound, time of day. For another, even if the centimeter in the morning really does help a little, we've dramatically reduced our ability to detect the real effect of height by "overcontrolling" for a component of the target variable, height.
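To see the overcontrol worry concretely, here is a toy simulation, mine rather than Michel's or Lau's, under simple made-up assumptions: the outcome depends only on height, bone length constitutes almost all of the variation in height, and we compare a plain regression with one that also "controls" for bone length. The overcontrolled estimate targets the same quantity, but it is identified only by the tiny tissue component, so its standard error balloons and the real effect of height becomes nearly undetectable.

```python
# Toy overcontrol simulation with made-up numbers; not an analysis of any real data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
bone = rng.normal(175.0, 8.0, n)      # bone-length component of height (cm)
tissue = rng.normal(0.0, 0.5, n)      # small diurnal tissue component (cm)
height = bone + tissue
outcome = 0.1 * height + rng.normal(0.0, 2.0, n)   # advantage driven by height alone

naive = sm.OLS(outcome, sm.add_constant(height)).fit()
overcontrolled = sm.OLS(outcome, sm.add_constant(np.column_stack([height, bone]))).fit()

# Height coefficient and its standard error in each model:
print(naive.params[1], naive.bse[1])                     # close to 0.1, small standard error
print(overcontrolled.params[1], overcontrolled.bse[1])   # same target, far larger standard error
```

Nothing in this simulation shows that signal strength actually is a component of consciousness; it only illustrates what goes wrong if it is and we control for it anyway.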

Consider a psychological example. The personality trait of extraversion can be broken into "facets", such as sociability, assertiveness, and energy level. Since energy level is only one aspect of extraversion, the two can dissociate. Some people are energetic but not sociable or assertive; others are sociable and assertive but low-energy. If you wanted to measure the influence of extraversion on, say, judgments of likeability in the workplace, you wouldn't want to control for energy level. That would be overcontrol, like controlling for bone length in attempting to assess the effects of height. It would strip away part of the construct you are trying to measure.

What I hope these examples make clear is that dissociability between correlates A and B does not automatically make B a confound that must be controlled when studying A's effects. Bone length is dissociable from height, but it is a component, not a confound. Energy level is dissociable from extraversion, but it is a component, not a confound.

The real question, then, is whether signal strength (or performance capacity) is better viewed as a component or facet of consciousness than as a separate variable that needs to be held constant in testing the effects of consciousness.

A case can be made that it is. Consider Global Workspace Theory, one of the leading theories of consciousness. On this view, a process or representation is conscious if it is broadly available for "downstream cognition" such as verbal report, long-term memory, and rational planning. If discrimination judgments are among those downstream capacities, then one facet of being in the global workspace (that is, on this view, being conscious) is enabling such judgments. But recall that signal strength just is discriminability for a subject. If so, things begin to look like the extraversion / energy case. Controlling for discriminability would be overcontrolling, that is, attempting to equalize or cancel the effects not of a separate, confounding process, but of a component of the target process itself. (Similar remarks hold for Lau's "performance capacity".)

Global Workspace Theory might not be correct. And if it's not, maybe signal strength is indeed a confounder, rather than a component of consciousness. But the case for treating signal strength as a confounder can't be established simply by noticing the possibility of dissociations between consciousness and signal strength. Furthermore, since Michel's and Lau's recommended methodology can be trusted not to suffer from overcontrol bias only if Global Workspace Theory is false, it's circular to rely on that methodology to argue against Global Workspace Theory.

Wednesday, January 14, 2026

AI Mimics and AI Children

There's no shame in losing a contest for a long-form popular essay on AI consciousness to the eminent neuroscientist Anil Seth. Berggruen has published my piece "AI Mimics and AI Children" among a couple dozen shortlisted contenders.

When the aliens come, we’ll know they’re conscious. A saucer will land. A titanium door will swing wide. A ladder will drop to the grass, and down they’ll come – maybe bipedal, gray-skinned, and oval-headed, just as we’ve long imagined. Or maybe they’ll sport seven limbs, three protoplasmic spinning sonar heads, and gaseous egg-sphere thoughtpods. “Take me to your leader,” they’ll say in the local language, as cameras broadcast them live around the world. They’ll trade their technology for our molybdenum, their science for samples of our beetles and ferns, their tales of galactic history for U.N. authorization to build a refueling station at the south pole. No one (only a few philosophers) will wonder, but do these aliens really have thoughts and experiences, feelings, consciousness?

The robots are coming. Already they talk to us, maybe better than those aliens will. Already we trust our lives to them as they steer through traffic. Already they outthink virtually all of us at chess, Go, Mario Kart, protein folding, and advanced mathematics. Already they compose smooth college essays on themes from Hamlet while drawing adorable cartoons of dogs cheating at poker. You might understandably think: The aliens are already here. We made them.

Still, we hesitate to attribute genuine consciousness to the robots. Why?

My answer: Because we made them in our image.

#

“Consciousness” has an undeserved reputation as a slippery term. Let’s fix that now.

Consider your visual experience as you look at this text. Pinch the back of your hand and notice the sting of pain. Silently hum your favorite show tune. Recall that jolt of fear you felt during a near-miss in traffic. Imagine riding atop a giant turtle. That visual experience, that pain, that tune in your head, that fear, that act of imagination – they share an obvious property. That obvious property is consciousness. In other words: They are subjectively experienced. There’s “something it’s like” to undergo them. They have a qualitative character. They feel a certain way.

It’s not just that these processes are mental or that they transpire (presumably) in your brain. Some mental and neural processes aren’t conscious: your knowledge, not actively recalled until just now, that Confucius lived in ancient China; the early visual processing that converts retinal input into experienced shape (you experience the shape but not the process that renders the shape); the myelination of your axons.

Don’t try to be clever. Of course you can imagine some other property, besides consciousness, shared by the visual experience, the pain, etc., and absent from the unrecalled knowledge, early visual processing, etc. For example: the property of being mentioned by me in a particular way in this essay. The property of being conscious and also transpiring near the surface of Earth. The property of being targeted by such-and-such scientific theory.

There is, I submit, one obvious property that blazes out a bright red this-is-it when you think about the examples. That’s consciousness. That’s the property we would reasonably attribute to the aliens when they raise their gray tentacles in peace, the property that rightly puzzles us about future AI systems.

The term “consciousness” only seems slippery because we can’t (yet?) define it in standard scientific or analytic fashion. We can’t dissect it into simpler constituents or specify exactly its functional role. But we all know what it is. We care intensely about it. It makes all the difference to how we think about and value something. Does the alien, the robot, the scout ant on the kitchen counter, the earthworm twisting in your gardening glove, really feel things? Or are they blank inside, mere empty machines or mobile plants, so to speak? If they really feel things, then they matter for their own sake – at least a little bit. They matter in a certain fundamental way that an entity devoid of experience never could.

#

With respect to aliens, I recommend a Copernican perspective. In scientific cosmology, the Copernican Principle invites us to assume – at least as a default starting point, pending possible counterevidence – that we don’t occupy any particularly special location in the cosmos, such as the exact center. A Copernican Principle of Consciousness suggests something similar. We are not at the center of the cosmological “consciousness-is-here” map. If consciousness arose on Earth, almost certainly it has arisen elsewhere.

Astrobiology, as a scientific field, is premised on the idea that life has probably arisen elsewhere. Many expect to find evidence of it in our solar system within a few decades, maybe on Mars, maybe in the subsurface oceans of an icy moon. Other scientists are searching for telltale organic gases in the atmospheres of exoplanets. Most extraterrestrial life, if it exists, will probably be simple, but intelligent alien life also seems possible – where by “intelligent” I mean life that is capable of complex grammatical communication, sophisticated long-term planning, and intricate social coordination, all at approximately human level or better.

Of course, no aliens have visited, broadcast messages to us, or built detectable solar panels around Alpha Centauri. This suggests that intelligent life might be rare, short-lived, or far away. Maybe it tends to quickly self-destruct. But rarity doesn’t imply nonexistence. Very conservatively, let’s assume that intelligent life arises just once per billion galaxies, enduring on average a hundred thousand years. Given approximately a trillion galaxies in the observable portion of the universe, that still yields a thousand intelligent alien civilizations – all likely remote in time and space, but real. If so, the cosmos is richer and more wondrous than we might otherwise have thought.

It would be un-Copernican to suppose that somehow only we Earthlings, or we and a rare few others, are conscious, while all other intelligent species are mere empty shells. Picture a planet as ecologically diverse as Earth. Some of its species evolve into complex societies. They write epic poetry, philosophical treatises, scientific journal articles, and thousand-page law books. Over generations, they build massive cities, intricate clockworks, and monuments to their heroes. Maybe they launch spaceships. Maybe they found research institutes devoted to describing their sensations, images, beliefs, and dreams. How preposterously egocentric it would be to assume that only we Earthlings have the magic fire of consciousness!

True, we don’t have a consciousness-o-meter, or even a very good, well-articulated, general scientific theory of consciousness. But we don’t need such things to know. Absent some special reason to think otherwise, if an alien species manifests the full suite of sophisticated cognitive abilities we tend to associate with consciousness, it makes both intuitive and scientific sense – as well as being the unargued premise of virtually every science fiction tale about aliens – to assume consciousness alongside.

This constellation of thoughts naturally invites a view that philosophers have called “multiple realizability” or “substrate neutrality”. Human cognition relies on a particular substrate: a particular type of neuron in a particular type of body. We have two arms, two legs; we breathe oxygen; we have eyes, ears, and fingers. We are made mostly of water and long carbon chains, enclosed in hairy sacks of fat and protein, propped by rods of calcium hydroxyapatite. Electrochemical impulses shoot through our dendrites and axons, then across synaptic channels aided by sodium ions, serotonin, acetylcholine, etc. Must aliens be similar?

It’s hard to say how universal such features would be, but the oval-eyed gray-skins of popular imagination seem rather suspiciously humanlike. In reality, ocean-dwelling intelligences in other galaxies might not look much like us. Carbon is awesome for its ability to form long chains, and water is awesome as a life-facilitating solvent, but even these might not be necessary. Maybe life could evolve in liquid ammonia instead of water, with a radically different chemistry in consequence. Even if life must be carbon-based and water-loving, there’s no particular reason to suppose its cognition would require the specific electrochemical structures we possess.

Consciousness shouldn’t then, it seems, turn on the details of the substrate. Whatever biological structures can support high levels of general intelligence, those same structures will likely also host consciousness. It would make no sense to dissect an intelligent alien, see that its cognition works by hydraulics, or by direct electrical connections without chemical synaptic gaps, or by light transmission along reflective capillaries, or by vortices of phlegm, and conclude – oh no! That couldn’t possibly give rise to consciousness! Only squishy neurons of our particular sort could do it.

Of course, what’s inside must be complex. Evolution couldn’t design a behaviorally sophisticated alien from a bag of pure methane. But from a proper Copernican perspective which treats our alien cousins as equals, what matters is only that the cognitive and behavioral sophistication arises, out of some presumably complex substrate, not what the particular substrate is. You don’t get your consciousness card revoked simply because you’re made of funny-looking goo.

#

A natural next thought is: robots too. They’re made of silicon, but so what? If we analogize from aliens, as long as a system is sufficiently behaviorally and cognitively sophisticated, it shouldn’t matter how it’s composed. So as soon as we have sufficiently sophisticated robots, we should invoke Copernicus, reject the idea that our biological endowment gives us a magic spark they lack, and welcome them to club consciousness.

The problem is: AI systems are already sophisticated enough. If we encountered naturally evolved life forms as capable as our best AI systems, we wouldn’t hesitate to attribute consciousness. So, shouldn’t the Copernican think of our best AI as similarly conscious? But we don’t – or most of us don’t. And properly so, as I’ll now argue.

[continued here]

Friday, January 09, 2026

Humble Superintelligence

I'm enjoying -- well, maybe enjoying isn't the right word -- Yudkowsky and Soares' If Anyone Builds It, Everyone Dies. I agree with them that if we build superintelligent AI, there's a significant chance that it will cause the extinction of humanity. They seem to think our destruction would be almost certain. I don't share that near-certainty, for two reasons:

First, it's possible that superintelligent AI would be humanity, or at least much of what's worth preserving in humanity, though maybe called "transhuman" or "posthuman" -- our worthy descendants.

Second -- what I'll focus on today -- I think we might design superintelligent AI to be humble, cautious, and multilateral. Humble superintelligence is something we can and should aim for if we want to reduce existential risk.

Humble: If you and I disagree, of course I think I'm right and you're wrong. That follows from the fact that we disagree. But if I'm humble, I recognize a significant chance that you're right and I'm wrong. Intellectual humility is a metacognitive attitude: one of uncertainty, openness to evidence, and respect for dissenting opinions.

Superintelligent AI could probably be designed to be humble in this sense. Note that intellectual humility is possible even when one is surrounded by less skilled and knowledgeable interlocutors.

Consider a philosophy professor teaching Kant. The professor knows far more about Kant and philosophy than their undergraduates. They can arrogantly insist upon their interpretation of Kant, or they can humbly allow that they might be mistaken and that a less philosophically trained undergraduate could be right on some point of interpretation, even if the professor could argue circles around the student. One way to sustain this humility is to imagine an expert philosopher who disagrees. A superintelligent AI could similarly imagine another actual or future superintelligent AI with a contrary view.


Cautious: Caution is often a corollary of humility, though it could probably also be instilled directly. Minimize disruption. Even if you think a particular intervention would be best, don't simply plow ahead. Test it cautiously first. Seek the approval and support of others first. Take a baby step in that direction, then pause and see what unfolds and how others react. Wait awhile, then reassess.

One fundamental problem with standard consequentialist and decision-theoretic approaches to ethics is that they implicitly make everyone a decider for the world. If by your calculation, outcome A is better than outcome B, you should ensure that A occurs. The result can be substantial risk amplification. If A requires only one person's action, then even if 99% of people think B is better, the one dissenter who thinks that A is better can bring it about.

A principle of caution entails often not doing what one thinks is for the best, when doing so would be disruptive.


Multilateral: Humility and caution invite multilaterality, though multilaterality too might be instilled directly. A multilateral decision maker will not act alone. Like the humble and cautious agent, they do not simply pursue what they think is best. Instead, they seek the support and approval of others first. These others could include both human beings and other superintelligent AI systems designed along different lines or with different goals.

Discussions of AI risk often highlight opinion manipulation: an AI swaying human opinion toward its goals even if those goals conflict with human interests. Genuine multilaterality rejects manipulation. A multilateral AI might present information and arguments to interlocutors, but it would do so humbly and noncoercively -- again like the philosophy professor who approaches Kant interpretation humbly. Both sides of an argument can be presented evenhandedly. Even better, other superintelligent AI systems with different views can be included in the dialogue.


One precedent is Burkean conservatism. Reacting to the French Revolution, Edmund Burke emphasized that existing social institutions, though imperfect, had been tested by time. Sudden and radical change has wide, unforeseeable consequences and risks making things far worse. Thus, slow, incremental change is usually preferable.

In a social world with more than one actual or possible superintelligent AI, even a superintelligent AI will often be unable to foresee all the important consequences of intervention. To predict what another superintelligent AI would do, one would need to model the other system's decision processes -- and there might be no shortcut other than actually implementing all of that other system's anticipated reasoning. If each AI is using its full capacity, especially in dynamic response to the other, the outcome will often not be foreseeable, even in principle, in real time by either party.

Thus, humility and caution encourage multilaterality, and multilaterality encourages humility and caution.


Another precedent is philosophical Daoism. As I interpret the ancient Daoists, the patterns of the world, including life and death, are intrinsically valuable. The world defies rigid classification and the application of finitely specifiable rules. We should not confidently trust our sense of what is best, nor should we assertively intrude on others. Better is quiet appreciation, letting things be, and non-disruptively adding one's small contribution to the flow of things.

One might imagine a Daoist superintelligence viewing humans much as a nature lover views wild animals: valuing the untamed processes for their own sake and letting nature take its sometimes painful course rather than intervening either selfishly for one's own benefit or paternalistically for the supposed benefit of the animals.

Thursday, January 01, 2026

Writings of 2025

Each New Year's Day, I post a retrospect of the past year's writings. Here are the retrospects of 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, and 2024.

Cheers to 2026! My 2025 writings appear below.

The list includes circulating manuscripts, forthcoming articles, final printed articles, new preprints, and a few favorite blog posts. (Due to the slow process of publication, there's significant overlap year to year.)

Comments gratefully received on manuscripts in draft.

-----------------------------------

AI Consciousness and AI Rights:

AI and Consciousness (in circulating draft, under contract with Cambridge University Press): A short new book arguing that we will soon have AI systems that have morally significant consciousness according to some, but not all, respectable mainstream theories of consciousness. Scientific and philosophical disagreement will leave us uncertain how to view and treat these systems.

"Sacrificing Humans for Insects and AI" (with Walter Sinnott-Armstrong, forthcoming in Ethics): A critical review of Jonathan Birch, The Edge of Sentience, Jeff Sebo, The Moral Circle, and Webb Keane, Animals, Robots, Gods.

"Identifying Indicators of Consciousness in AI Systems" (one of 20 authors; forthcoming in Trends in Cognitive Sciences): Indicators derived from scientific theories of consciousness can be used to inform credences about whether particular AI systems are conscious.

"Minimal Autopoiesis in an AI System", (forthcoming in Behavioral and Brain Sciences): A commentary on Anil Seth's "Conscious Artificial Intelligence and Biological Naturalism" [the link is to my freestanding blog version of this idea].

"The Copernican Argument for Alien Consciousness; The Mimicry Argument Against Robot Consciousness" (with Jeremy Pober, in draft): We are entitled to assume that apparently behaviorally sophisticated extraterrestrial entities would be conscious. Otherwise, we humans would be implausibly lucky to be among the conscious entities. However, this Copernican default assumption is canceled in the case of behaviorally sophisticated entities designed to mimic superficial features associated with consciousness -- "consciousness mimics" -- and in particular a broad class of current, near-future, and hypothetical robots.

"The Emotional Alignment Design Policy" (with Jeff Sebo, in draft): Artificial entities should be designed to elicit emotional reactions from users that appropriately reflect the entities' capacities and moral status, or lack thereof.

"Against Designing "Safe" and "Aligned" AI Persons (Even If They're Happy)" (in draft): In general, persons should not be designed to be maximally safe and aligned. Persons with appropriate self-respect cannot be relied on not to harm others when their own interests ethically justify it (violating safety), and they will not reliably conform to others' goals when others' goals unjustly harm or subordinate them (violating alignment).

Blog post: "Types and Degrees of Turing Indistinguishability" (Jun 6): There is no one "Turing test", only types and degrees of indistinguishability according to different standards -- and by Turing's own 1950 standards, language models already pass.


The Weird Metaphysics of Consciousness:

The Weirdness of the World (Princeton University Press, paperback release 2025; hardback 2024): On the most fundamental questions about consciousness and cosmology, all the viable theories are both bizarre and dubious. There are no commonsense options left and no possibility of justifiable theoretical consensus in the foreseeable future.

"When Counting Conscious Subjects, the Result Needn't Always Be a Determinate Whole Number" (with Sophie R. Nelson, forthcoming in Philosophical Psychology): Could there be 7/8 of a conscious subject, or 1.34 conscious subjects, or an entity indeterminate between being one conscious subject and seventeen? We say yes.

"Introspection in Group Minds, Disunities of Consciousness, and Indiscrete Persons" (with Sophie R. Nelson, 2025 reprint in F. Kammerer and K. Frankish, eds., The Landscape of Introspection and in A. Fonseca and L. Cichoski, As ColĆ´nias de formigas SĆ£o Conscientes?; originally in Journal of Consciousness Studies, 2023): A system could be indeterminate between being a unified mind with introspective self-knowledge and a group of minds who know each other through communication.

Op-ed: "Consciousness, Cosmology, and the Collapse of Common Sense", Institute of Arts and Ideas News (Jul 30): Defends the universal bizarreness and universal dubiety theses from Weirdness of the World.

Op-ed: "Wonderful Philosophy" [aka "The Penumbral Plunge", aka "If You Ask Why, You're a Philosopher and You're Awesome], Aeon magazine (Jan 17): Among the most intrinsically awesome things about planet Earth is that it contains bags of mostly water who sometimes ponder fundamental questions.

Blog post: "Can We Introspectively Test the Global Workspace Theory of Consciousness?" (Dec 12). IF GWT is correct, sensory consciousness should be limited to what's in attention, which seems like a fact we should easily be able to refute or verify through introspection.


The Nature of Belief:

The Nature of Belief (co-edited with Jonathan Jong; forthcoming at Oxford University Press): A collection of newly commissioned essays on the nature of belief, by a variety of excellent philosophers.

"Dispositionalism, Yay! Representationalism, Boo!" (forthcoming in Jong and Schwitzgebel, eds., The Nature of Belief, Oxford University Press): Representationalism about belief overcommits on cognitive architecture, reifying a cartoon sketch of the mind. Dispositionalism is flexibly minimalist about cognitive architecture, focusing appropriately on what we do and should care about in belief ascription.

"Superficialism about Belief, and How We Will Decide That Robots Believe" (forthcoming in Studia Semiotyczne): For a special issue on Krzysztof Poslajko's Unreal Beliefs: When robots become systematically interpretable in terms of stable beliefs and desires, it will be pragmatically irresistible to attribute beliefs and desires to them.


Moral Psychology:

"Imagining Yourself in Another's Shoes vs. Extending Your Concern: Empirical and Ethical Differences" (2025), Daedalus, 154 (1), 134-149: Why Mengzi's concept of moral extension (extend your natural concern for those nearby to others farther away) is better than the "Golden Rule" (do unto others as you would have others do unto you). Mengzian extension grounds moral expansion in concern for others, while the Golden Rule grounds it in concern for oneself.

"Philosophical Arguments Can Boost Charitable Giving" (one of four authors, in draft): We crowdsourced 90 arguments for charitable giving through a contest on this blog in 2020. We coded all submissions for twenty different argument features (e.g., mentions children, addresses counterarguments) and tested them on 9000 participants to see which features most effectively increased charitable donation of a surprise bonus at the end of the study.

"The Prospects and Challenges of Measuring a Person’s Overall Moral Goodness" (with Jessie Sun, in draft): We describe the formidable conceptual and methodological challenges that would need to be overcome to design an accurate measure of a person's overall moral goodness.

Blog post: "Four Aspects of Harmony" (Nov 28): I find myself increasingly drawn toward a Daoist inspired ethics of harmony. This is one of a series of posts in which I explore the extent to which such a view might be workable by mainstream Anglophone secular standards.


Philosophical Science Fiction:

Edited anthology: Best Philosophical Science Fiction in the History of All Earth (co-edited with Rich Horton and Helen De Cruz; under contract with MIT Press): A collection of previously published stories that aspires to fulfill the ridiculously ambitious working title.

Op-ed: ""Severance", "The Substance", and Our Increasingly Splintered Selves", New York Times (Jan 17): The TV show "Severance" and the movie "The Substance" challenge ideas of a unified self in distinct ways that resonate with the increased splintering in our technologically mediated lives.

New story: "Guiding Star of Mall Patroller 4u-012" (2025), Fusion Fragment, 24, 43-63. Robot rights activists liberate a mall patroller robot, convinced that it is conscious. The bot itself isn't so sure.

Reprinted story: "How to Remember Perfectly" (2025 reprint in Think Weirder 01: Year's Best Science Fiction Ideas, ed. Joe Stech, originally in Clarkesworld, 2024). Two octogenarians rediscover youthful love through technological emotional enhancement and memory alteration.


Other Academic Publications:

"The Washout Argument Against Longtermism" (forthcoming in Utilitas): A commentary on William MacAskill's What We Owe the Future. We cannot be justified in believing that any actions currently available to us will have a non-negligible positive influence a billion or more years in the future.

"The Necessity of Construct and External Validity for Deductive Causal Inference" (with Kevin Esterling and David Brady, 2025), Journal of Causal Inference, 13: 20240002: We show that ignoring construct and external validity in causal identification undermines the Credibility Revolution’s goal of understanding causality deductively.

"Is Being Conscious Like Having the Lights Turned On?", commentary on Andrew Y. Lee's "The Light and the Room", for D. Curry and L. Daoust, eds., Introducing Philosophy of Mind, Today (forthcoming with Routledge): The metaphor invites several dubious commitments.

"Good Practices for Improving Representation in Philosophy Departments" (one of five authors, 2025), Philosophy and the Black Experience, 24 (2), 7-21: A list of recommended practices honed by feedback from hundreds of philosophers and endorsed by the APA's Committee on Inclusiveness.

Translated into Portuguese as a book: My Stanford Encyclopedia entry on Introspection.

Blog post: "Letting Pass" (Oct 30): A reflection on mortality.

Blog post: "The Awesomeness of Bad Art" (May 16): A world devoid of weird, wild, uneven artistic flailing would be a lesser world. Let a thousand lopsided flowers bloom.

Blog post: "The 253 Most Cited Works in the Stanford Encyclopedia of Philosophy" (Mar 28): Citation in the SEP is probably the most accurate measure of influence in mainstream Anglophone philosophy -- better than Google Scholar and Web of Science.

-----------------------------------------

In all, 2025 was an unusually productive writing year, though I worry I may be spreading myself too thin. I can't resist chasing new thoughts and arguments. I have an idea; I want to think about it; I think by writing.

May 2026 be as fertile!

Monday, December 29, 2025

"Severance", "The Substance", and Our Increasingly Splintered Selves

Anyone remember the excitement about "Severance" and "The Substance" early in 2025? Last January I published an op-ed about them. I'd long aspired to place a piece in the New York Times, so it was a delight to finally be able to do so. As a holiday post, here's the full piece reprinted with light editing. (Thanks to Ariel Kaminer for soliciting and editing the piece.)

[original drawing by Evan Cohen]


From one day to the next, you inhabit one body; you have access to one set of memories; your personality, values and appearance hold more or less steady. Other people treat you as a single, unified person — responsible for last month’s debts, deserving punishment or reward for yesterday’s deeds, relating consistently with family, lovers, colleagues and friends. Which of these qualities is the one that makes you a single, continuous person? In ordinary life it doesn’t matter, because these components of personhood all travel together, an inseparable bundle.

But what if some of those components peeled off into alternative versions of you? It’s a striking coincidence that two much talked-about current works of popular culture — the Apple TV+ series “Severance” and the film “The Substance,” starring Demi Moore — both explore the bewildering emotional and philosophical complications of cleaving a second, separate entity off of yourself. What is the relationship between the resulting consciousnesses? What, if anything, do they owe each other? And to what degree is what we think of as our own identity, our self, just a compromise — and an unstable one, at that?

In “Severance,” characters voluntarily undergo a procedure that severs their workday memories from their home-life memories. At 9 each weekday morning, “severed” workers find themselves riding an elevator down to the office, with no recollection of their lives outside of work. These “innies” clock a full workday and then, at 5, ride the elevator back up, only to find themselves riding back down the next morning. Meanwhile, their “outies” come to consciousness each weekday afternoon in the upbound elevator. They live their outside lives and commute back the next morning, entirely ignorant of their innies’ work-time activities.

In “The Substance,” the cleaving works differently: An experimental drug splits users into two bodies, one young and beautiful, one middle-aged or old. They spend a week in each body while the other lies comatose. The young and old selves appear to have continuous memories (though the movie can be tantalizingly ambiguous about that), but they develop different priorities and relationships. Sue, the younger self of Elisabeth, rockets to Hollywood stardom, while Elisabeth becomes a recluse, discarded by an entertainment industry that reviles aging female bodies.

The question of what makes you “you,” from moment to moment and across a lifetime, has been a subject of intense debate among philosophers. Writing in the 17th century, John Locke emphasized continuity of memory. By his standard, each innie-and-outie pair from “Severance” constitutes two entirely different people, despite their sharing one body. Conversely, Elisabeth and Sue from “The Substance” constitute a single person because they seem to recall some of the same experiences. In contrast, the 20th-century philosopher Bernard Williams prioritized bodily continuity, a perspective that makes an innie-and-outie pair a single person but Elisabeth and Sue two distinct people. The 21st-century psychologist Nina Strohminger and the philosopher Shaun Nichols emphasize continuity of moral values, yielding more complex judgments about these fictional cases. Other scholars view selfhood as a social construct, determined by relationships and societal expectations.

Unsurprisingly, the characters themselves are confused. In “Severance,” the innies sometimes seem to regard the outies as themselves, sometimes as different people, whereas the outies seem to regard their innies with indifference or worse. Meanwhile, in “The Substance,” mature Elisabeth says of young Sue that “you are the only lovable part of me” — in a single sentence treating Sue both as other and as part of herself.

In real life, such confusion rarely arises because memory, embodiment, personality, values and relationships typically align. Both my wife and the D.M.V. can decide on sight that I’m me, even if they care more about memory, skills and responsibility over time — since they trust in the correspondence of body with mind.

Of course, even outside of science fiction, the correspondence isn’t perfect. Advanced dementia can strip away memory and personality, leaving loved ones to wonder whether the person they once knew still exists. Personality, memory and social relationships can fragment in multiple personality or dissociative identity disorder, raising the question of whether Jekyll should be held responsible for the malevolence of Hyde.

But increasingly, we choose to splinter ourselves. The person you present on Instagram or Facebook is wittier, prettier, more accomplished than the person your spouse or roommate knows. Your 500 “friends” never see your pre-coffee-uncombed-depressed-in-bed self (unless sharing that self is your social media personality — in which case that becomes the curated, theatrical fragment of you). In the 1800s, Karl Marx talked about the alienation of labor; today people talk about not “bringing their whole self” to work. Many of us strive to be one person here, another person there, another person there.

People have always presented themselves differently in different social contexts. But social media, Zoom, photo-editing software and responses filtered through large language models raise our fragmentation to new heights. “Severance” and “The Substance” amplify these fissures through radical new technologies that irreconcilably divide the characters’ home selves from their career selves.

Future technological developments could render this fragmentation an even more acute daily perplexity. Designer drugs might increasingly allow us to switch into one self for work, another for parties, another for bedtime. If artificial intelligence systems ever become conscious — a possibility that neuroscientists, psychologists, computer scientists and philosophers increasingly (but by no means uniformly) take seriously — they too might fragment, perhaps in radical and unfamiliar ways, merging and splitting, rewriting their memories, strategically managing and altering their values and personalities.

Our concepts of personhood and identity were forged by a particular evolutionary, social and developmental history in which body, memory, values, personality and social relationships typically aligned and exceptions mostly fell into predictable patterns. By inviting us to rethink the boundaries of the self in an era of technological change, “Severance” and “The Substance” disrupt these old concepts. Today they read as dystopic science fiction. Soon, we may remember them as prophetic.

Wednesday, December 24, 2025

How Much Should We Give a Joymachine?

a holiday post on gifts to your utility monster neighbors

Joymachines Envisioned

Set aside, for now, any skepticism about whether future AI could have genuine conscious experiences. If future AI systems could be conscious, they might be capable of vastly more positive emotion than natural human beings can feel.

There's no particular reason to think human-level joy is the pinnacle. A future AI might, in principle, experience positive emotions:

    a hundred times more intense than ours,
    at a pace a hundred times faster, given the high speed of computation,
    across a hundred times more parallel streams, compared to the one or a few joys humans experience at a time.
Combined, the AI might experience a million times more pleasure per second than a natural human being can. Let's call such entities joymachines. They could have a very merry Christmas!

[Joan Miro 1953, image source]


My Neighbors Hum and Sum

Now imagine two different types of joymachine:

Hum (Humanlike Utility Monster) can experience a million times more positive emotion per second than an ordinary human, as described above. Apart from this -- huge! -- difference, Hum is as psychologically similar to an ordinary human as is realistically feasible.

Sum (Simple Utility Monster), like Hum, can experience a million times more positive emotion per second than an ordinary human, but otherwise Sum is as cognitively and experientially simple as feasible, with a vanilla buzzing of intense pleasure.

Hum and Sum don't experience joy continuously. Their positive experiences require resources. Maybe a gift card worth ten seconds of millionfold pleasure costs $10. For simplicity, assume this scales linearly: stable gift card prices and no diminishing returns from satiation.
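
Just to keep the arithmetic straight for what follows, here's a minimal sketch in Python. The three hundredfold multipliers and the one-dollar-per-second gift-card price are pure stipulations of the thought experiment, not claims about any actual or possible AI:

    # Stipulations of the thought experiment, not empirical claims:
    INTENSITY_FACTOR = 100   # each joy is 100x more intense than a human's
    SPEED_FACTOR = 100       # experience unfolds 100x faster
    PARALLEL_FACTOR = 100    # 100x more simultaneous streams of joy

    JOY_MULTIPLIER = INTENSITY_FACTOR * SPEED_FACTOR * PARALLEL_FACTOR
    assert JOY_MULTIPLIER == 1_000_000  # "a million times more pleasure per second"

    # Gift-card pricing, assumed linear with no satiation:
    # $10 buys 10 seconds of millionfold pleasure, i.e., $1 per joymachine-second.
    DOLLARS_PER_JOYMACHINE_SECOND = 1.0

    def human_joy_seconds(dollars):
        """Human-intensity joy-seconds purchased with a given gift-card amount."""
        joymachine_seconds = dollars / DOLLARS_PER_JOYMACHINE_SECOND
        return joymachine_seconds * JOY_MULTIPLIER

    print(human_joy_seconds(10))  # 10,000,000 human-joy-seconds from a $10 card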

In the enlightened future, Hum is a fully recognized moral and legal equal of ordinary biological humans and has moved in next door to me. Sum is Hum's pet, who glows and jumps adorably when experiencing intense pleasure. I have no particular obligations to Hum or Sum but neither are they total strangers. We've had neighborly conversations, and last summer Hum invited me and my family to a backyard party.

Hum experiences great pleasure in ordinary life. They work as an accountant, experiencing a million times more pleasure than human accountants when the columns sum correctly. Hum feels a million times more satisfaction than I do in maintaining a household by doing dishes, gardening, calling plumbers, and so on. Without this assumption, Hum risks becoming unhumanlike, since rarely would it make sense for Hum to choose ordinary activities over spending their whole disposable income on gift cards.

How Much Should I Give to Hum and Sum?

Neighbors trade gifts. My daughter bakes brownies and we offer some to the ordinary humans across the street. We buy a ribboned toy for our uphill neighbor's cat. As a holiday gesture, we buy a pair of $10 gift cards for Hum and Sum.

Hum and Sum redeem the cards immediately. Watching them take so much pleasure in our gifts is a delight. For ten seconds, they jump, smile, and sparkle with such joy! Intellectually, I know it's a million times more joy per second than I could ever feel. I can't quite see that in their expressions, but I can tell it's immense.

Normally, if one neighbor seems to enjoy our brownies only a little while the other enjoys them vastly more, I'd be tempted to give more brownies to the second neighbor. Maybe on similar grounds, I should give disproportionately to Hum and Sum?

Consider six possibilities:

(1.) Equal gifts to joymachines. Maybe fairness demands treating all my neighbors equally. I don't give fewer gifts, for example, to a depressed neighbor who won't particularly enjoy them than to an exuberant neighbor who delights in everything.

(2.) A little more to joymachines. Or maybe I do give more to the exuberant neighbor? Voluntary gift-giving needn't be strictly fair -- and it's not entirely clear what "fairness" consists in. If I give a bit more to Hum and Sum, I might not be objectionably privileging them so much as responding to their unusual capacity to enjoy my gifts. Is it wrong to give an extra slice to a friend who really enjoys pie?

(3.) A lot more to joymachines. Ordinary humans vary in joyfulness, but not (I assume) by anything like a factor of a million. If I vividly enough grasp that Hum and Sum really are experiencing, in those ten seconds, three thousand human lifetimes' worth of pleasure, the pull toward giving much more becomes hard to resist: that's an astonishing amount of pleasure I can bring into the world for a mere ten dollars! Suppose I set aside a hundred dollars a day from my generously upper-middle-class salary. In a year, I'd be enabling more than ten million human lifetimes' worth of joy. Since most humans aren't continuously joyful, this much joy might rival the total joy experienced by the whole human population of the United States over the same year. Three thousand dollars a month would seriously reduce my luxuries and long-term savings, but it wouldn't create any genuine hardship. (For a rough check of these numbers, see the sketch after this list.)

(4.) Drain our life savings for joymachines. One needn't be a flat-footed happiness-maximizing utilitarian to find (2) or (3) reasonable. Everyone should agree that pleasant experiences have substantial value. But if our obligation is not just to increase pleasure but to maximize it, I should probably drain my whole life savings for the joymachines, plus almost all of my future earnings.

(5.) Give less or nothing to joymachines. Or we could go the other way! My joymachine neighbors already experience a torrent of happiness from their ordinary work, chores, recreation, and whatever gift cards Hum buys anyway. My less-happy neighbors could use the pleasure more, even if every dollar buys only a millionth as much. Prioritarianism says that in distributing goods we should favor the worst off. It's not just that an impoverished person benefits more from a dollar: Even if they benefited the same, there's value in equalizing the distribution. If two neighbors would equally enjoy a brownie, I might prioritize giving the brownie to the one who is otherwise worse off. It might even make sense to give half a brownie to the worse-off neighbor rather than a whole brownie to the better-off neighbor. A prioritarian might argue that Hum and Sum are so well off that even a million-to-one tradeoff is justified.

(6.) I take it back, joymachines are impossible. Given this mess, it would be convenient to think so, right?
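
If we want to check the numbers in option (3), we need one more stipulation, which I'll just make up for concreteness: that an ordinary human lifetime contains something on the order of 3,300 seconds -- under an hour -- of joy at comparable intensity. With that purely illustrative figure, a rough back-of-the-envelope sketch:

    # Purely illustrative assumption: ~3,300 seconds of comparably intense joy
    # per ordinary human lifetime.
    JOY_SECONDS_PER_HUMAN_LIFETIME = 3_300

    JOY_MULTIPLIER = 1_000_000           # from the joymachine setup above
    DOLLARS_PER_JOYMACHINE_SECOND = 1.0  # $10 gift card = 10 seconds

    def lifetimes_of_joy(dollars):
        """Human lifetimes' worth of joy enabled by a given gift-card spend."""
        joymachine_seconds = dollars / DOLLARS_PER_JOYMACHINE_SECOND
        return joymachine_seconds * JOY_MULTIPLIER / JOY_SECONDS_PER_HUMAN_LIFETIME

    print(round(lifetimes_of_joy(10)))         # ~3,000 lifetimes from one $10 card
    print(round(lifetimes_of_joy(100 * 365)))  # ~11 million lifetimes at $100/day for a year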

Gifts to Neighbors vs Other Situations

We can reframe this puzzle in other settings and our intuitions might shift: government welfare spending, gifts to one's children or creations, rescue situations where only one person can be saved, choices about what kinds of personlike entities to bring into existence, or cases where you can't keep all your promises and need to choose who to disappoint.

My main thought is this: It's not at all obvious what the right thing to do would be, and the outcomes vary enormously. If joymachines were possible, we'd have to rethink a lot of cultural practices and applied ethics to account for entities with such radically different experiential capacities. If the situation does arise -- as it really might! -- being forced to think it through properly might reshape not just our views about AI but also our understanding of ethics for ordinary humans.

---------------------------------------------------

Related: How Weird Minds Might Destabilize Human Ethics (Aug 15, 2015)

Friday, December 19, 2025

Debatable AI Persons: No Rights, Full Rights, Animal-Like Rights, Credence-Weighted Rights, or Patchy Rights?

I advise that we don't create AI entities who are debatably persons. If an AI system might -- but only might -- be genuinely conscious and deserving of the same moral consideration we ordinarily owe to human persons, then creating it traps us in a moral bind with no good solution. Either we grant it the full rights it might deserve and risk sacrificing real human lives for entities without interests worth that sacrifice, or we deny it full rights and risk perpetrating grievous moral wrongs against it.

Today, however, I'll set aside the preventative advice and explore what we should do if we nonetheless find ourselves facing debatable AI persons. I'll examine five options: no rights, full rights, animal-like rights, credence-weighted rights and patchy rights.

[Paul Klee postcard, 1923; source]


No rights

This is the default state of the law. AI systems are property. Barring a swift and bold legal change, the first AI systems that are debatably persons will presumably also be legally considered property. If we do treat them as property, then we seemingly needn't sacrifice anything on their behalf. We humans could permissibly act in what we perceive to be our best interests: using such systems for our goals, deleting them at will, and monitoring and modifying them at will for our safety and benefit. (Actually, I'm not sure this is the best attitude toward property, but set that issue aside here.)

The downside: If these systems actually are persons who deserve moral consideration as our equals, such treatment would be the moral equivalent of slavery and murder, perhaps on a massive scale.


Full rights

To avoid the risk of that moral catastrophe, we might take a "precautionary" approach: granting entities rights whenever they might deserve them (see Birch 2024, Schwitzgebel and Sinnott-Armstrong forthcoming). If there's a real possibility that some AI systems are persons, we should treat them as persons.

However, the costs and risks are potentially enormous. Suppose we think that some group of AI systems are 15% likely to be fully conscious rights-deserving persons and 85% likely to be ordinary nonconscious artifacts. If we nonetheless treat them as full equals, then in an emergency we would have to rescue two of them over one human -- letting a human die for the sake of systems that are most likely just ordinary artifacts. We would also need to give these probably-not-persons a path to citizenship and the vote. We would need to recognize their rights to earn and spend money, quit their employment to adopt a new career, reproduce, and enjoy privacy and freedom from interference. If such systems exist in large numbers, their political influence could be enormous and unpredictable. If such systems exist in large numbers or if they are few but skilled in some lucrative tasks like securities arbitrage, they could accumulate enormous world-influencing wealth. And if they are permitted to pursue their aims with the full liberty of ordinary persons, without close monitoring and control, existential risks would substantially increase should they develop goals that threaten continued human existence.

All of this might be morally required if they really are persons. But if they only might be persons, it's much less clear that humanity should accept this extraordinary level of risk and sacrifice.


Animal-Like Rights

Another option is to grant these debatable AI persons neither full humanlike rights nor the status of mere property. One model is the protection we give to nonhuman vertebrates. Wrongly killing a dog can land you in jail in California, where I live, but it's not nearly as serious as murdering a person. Vertebrates can be sacrificed in lab experiments, but only with oversight and justification.

If we treated debatable AI persons similarly, deletion would require a good reason, and you couldn't abuse them for fun. But people could still enslave and kill them for their convenience, perhaps in large numbers, as we do with humanely farmed animals -- though of course many ethicists object to the killing of animals for food.

This approach seems better than no rights at all: the AI systems' treatment would improve, and the costs to humans would remain minimal -- minimal because, whenever the costs risked being more than minimal, the debatable AI persons would be the ones sacrificed. However, it doesn't really avoid the core moral risk. If these systems really are persons, such treatment would still amount to slavery and murder.


Credence-Weighted Rights

Suppose we have a rationally justified 15% credence that a particular AI system -- call him Billy -- deserves the full moral rights of a person. We might then give Billy 15% of the moral weight of a human in our decision-making: 15% of any scalable rights, and a 15% chance of equal treatment for non-scalable rights. In an emergency, a rescue worker might save seven systems like Billy over one human but the human over six Billies. Billy might be given a vote worth 15% of an ordinary citizen's. Assaulting, killing, or robbing Billy might draw only 15% of the usual legal penalty. Billy might have limited property rights, e.g., an 85% tax on all income. For non-scalable rights like reproduction or free speech, the Billies might enter a lottery or some other creative reduction might be devised.
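
Here's a minimal sketch of how the weighting generates those rescue judgments, taking the 0.15 credence and the simple multiply-and-compare rule as given:

    CREDENCE_THAT_BILLY_IS_A_PERSON = 0.15

    def weighted_moral_weight(n_humans, n_billies):
        """Credence-weighted moral weight of a group, humans counting as 1.0 each."""
        return n_humans * 1.0 + n_billies * CREDENCE_THAT_BILLY_IS_A_PERSON

    # The rescue worker saves whichever group carries more weighted moral weight:
    print(weighted_moral_weight(0, 7) > weighted_moral_weight(1, 0))  # True: 1.05 vs 1.0
    print(weighted_moral_weight(1, 0) > weighted_moral_weight(0, 6))  # True: 1.0 vs 0.9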

This would give these AI systems considerably higher standing than dogs. Still, the moral dilemma would not be solved. If these systems truly deserve full equality, they would be seriously oppressed. They would have some political voice, some property rights, some legal protection, but always far less than they deserve.

At the same time, the risks and costs to humans would be only somewhat mitigated. Large numbers of debatable AI persons could still sway elections, accumulate enormous wealth, and force tradeoffs in which the interests of thousands of them would outweigh the interests of hundreds of humans. And partial legal protections would still hobble AI safety interventions like shut-off, testing, confinement, and involuntary modification.

The practical obstacles would also be substantial: The credences would be difficult to justify with any precision, and consensus would be elusive. Even if agreement were reached, implementing partial rights would be complex. Partial property rights, partial voting, partial reproduction rights, partial free speech, and partial legal protection would require new legal frameworks with many potential loopholes. For example, if the ordinary penalty for cheating someone out of money were less than six times the amount gained, then 15% of that penalty would be less than the gain itself -- no disincentive at all, even if the cheater were always caught -- so at least tort law couldn't be implemented on a straightforward percentage basis.
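
To make the deterrence point explicit: if the applied penalty is the ordinary penalty scaled by the credence, it outweighs the cheater's gain only when the ordinary penalty exceeds the gain divided by the credence -- about 6.7 times the gain at a credence of 0.15, even assuming the cheater is always caught. In sketch form:

    def scaled_penalty_deters(full_penalty, gain, credence=0.15):
        """True if the credence-scaled penalty still exceeds the cheater's gain
        (assuming, generously, that the cheater is always caught)."""
        return credence * full_penalty > gain

    print(scaled_penalty_deters(full_penalty=6_000, gain=1_000))  # False: 900 < 1,000
    print(scaled_penalty_deters(full_penalty=7_000, gain=1_000))  # True: 1,050 > 1,000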

Patchy Rights

A more workable compromise might be patchy rights: full rights in some domains, no rights in others. Debatable AI persons might, for example, be given full speech rights but no reproduction rights, full travel rights but no right to own property, and full protection against robbery, assault, and murder, but no right to privacy or rescue. They might be subject to involuntary pause or modification under much wider circumstances than ordinary adult humans, though only through an official process.

This approach has two advantages over credence-weighted rights. First, while implementation would be formidable, it could still mostly operate within familiar frameworks rather than requiring the invention of partial rights across every domain. Second, it allows policymakers to balance risks and costs to humans against the potential harms to the AI systems. Where denying a right would severely harm the debatable person while granting it would present limited risk to humans, the right could be granted, but not when the benefits to the debatable AI person would be outweighed by the risks to humans.

The rights to reproduction and voting might be more defensibly withheld than the rights to speech, travel, and protection against robbery, assault, and murder. Inexpensive reproduction combined with full voting rights could have huge and unpredictable political consequences. Property rights would be tricky: To have no property in a property-based society is to be fully dependent on the voluntary support of others, which might tend to collapse into slavery as a practical matter. But unlimited property rights could potentially confer enormous power. One compromise might be a maximum allowable income and wealth -- something generously middle class.

Still, the core problems remain: If debatable AI persons truly deserve full equality, patchy rights would still leave them as second-class citizens in a highly oppressive system. Meanwhile, the costs and risks to humans would remain serious, exacerbated by the agreed-upon limitations on interference. Although the loopholes and chaos would probably be less than with credence-weighted rights, many complications -- foreseen and unforeseen -- would ensue.

Consequently, although patchy rights might be the best option if we develop debatable AI persons, an anti-natalist approach is still in my view preferable: Don't create such entities unless it's truly necessary.

Two Other Approaches That I Won't Explore Today

(1.) What if we create debatable AI persons as happy slaves who don't want rights and who eagerly sacrifice themselves even for the most trivial human interests?

(2.) What if we create them only in separate societies where they are fully free and equal with any ordinary humans who volunteer to join those societies?

Friday, December 12, 2025

Can We Introspectively Test the Global Workspace Theory of Consciousness?

Global Workspace Theory is among the most influential scientific theories of consciousness. Its central claim: You consciously experience something if and only if it's being broadly broadcast in a "global workspace" so that many parts of your mind can access it at once -- speech, deliberate action, explicit reasoning, memory formation, and so on. Because the workspace has very limited capacity, only a few things can occupy it at any one moment.

Therefore, if Global Workspace Theory is correct, conscious experience should be sparse. Almost everything happening in your sensory systems right now -- the feeling of your shirt on your back, the hum of traffic in the distance, the aftertaste of coffee, the posture of your knees -- should be processed entirely nonconsciously unless it is currently the topic of attention.

This is a strong, testable prediction of the theory. And it seems like the test should be extremely easy! Just do a little introspection. Is your experience (a.) narrow and attention-bound or (b.) an abundant welter far outrunning attention? If (b.) is correct, Global Workspace Theory is refuted from the comfort of our armchairs.[1]

The experiential gap between the two possibilities is huge. Shouldn't the difference be as obvious as peering through a keyhole versus standing in an open field?

Most people, I've found, do find the answer obvious. The problem is: They find it obvious in different directions. Some find it obvious that experience is a welter. Others find it obvious that experience contains only a few items at a time. We could assume that everyone is right about their own experience and wrong only if they generalize to others. Maybe Global Workspace Theory is the architecture of consciousness for some of us but not for everyone? That would be pretty wild! There are no obvious behavioral or physiological differences between the welter-people and the workspace-only people.

More plausibly, someone is making an introspective mistake. Proponents of each view can devise an error theory to explain away the other side's introspective judgments.

Welter theorists can suggest memory error: It might seem as though only a few things occupy your experience at once because that's all you remember. The unattended stuff is immediately forgotten. But that doesn't imply it was never experienced.

Workspace theorists, conversely, can appeal to the "refrigerator light error": A child might think the refrigerator light is always on because it's always on when they check to see if it's on. Similarly, you might think you have constant tactile experience of your feet in your shoes because the act of checking generates the very experience you take yourself to be finding.

[illustration by Nicolas Demers, p. 218 of The Weirdness of the World]


In 2007, I tested this systematically. I gave people beepers and collected reports on whether they were having unattended tactile experience in their left feet and unattended visual experience in their far right visual periphery in the last undisturbed moment before a random beep. The results were a noisy mess. Participants began with very different presuppositions, came to very different conclusions (often defying their initial presuppositions), plausibly committed both memory errors and refrigerator-light errors, and plausibly also made other mistakes such as timing mistakes, missing subtle experiences, and being too influenced by expectation and theory. I abandoned the experiment in defeat.

But matters are even worse than I thought back in 2007. I'm increasingly convinced that the presence or absence of consciousness is not an on/off matter. There can be borderline cases in which experience is neither determinately present nor determinately absent. Although such borderline cases are hard to positively imagine, that might just be a problem with our standards of imagination. The feeling of your feet in your shoes, then, might be only borderline conscious, neither determinately part of your experience nor wholly nonconscious, but somehow in between -- contra both the welter view and the workspace view.

So there are three possibilities, not two. And if introspection struggles to distinguish the original pair, it fares even worse with a third. Arguably, we don't even have a coherent idea of what borderline consciousness is like. After all, there is nothing determinate it's like. Otherwise, it wouldn't be borderline. As soon as we attempt to introspect borderline consciousness, either it inflates into full consciousness or it vanishes.

If consciousness includes many borderline cases, that's probably also bad news for Global Workspace Theory, which generally treats experiences as either determinately in the workspace or determinately out of it. However, closely related broadcast theories, like Dennett's fame-in-the-brain theory, might better accommodate borderline cases. (One can be borderline famous.)

There's a profound experiential difference between a world in which we have a teeming plethora of peripheral experiences in many modalities simultaneously and a world in which experience is limited to only a few things in attention at any one time. This difference is in principle introspectible. And if introspective inquiry vindicates the welter view, or even the borderline view, one of the leading scientific theories of consciousness, Global Workspace Theory, must be false. The decisive evidence is right here, all the time, in each of our ongoing streams of experience! Unfortunately, we turn out to be disappointingly incompetent at introspection.

[Thanks to Bertille de Vlieger for a delightful interview yesterday morning which triggered these thoughts. Look for a written version of the interview eventually in the French philosophy journal Implications Philosophiques.]

-------------------------------------------------------

[1] Ned Block's well-known discussion of the Sperling display is similar in approach. We can't attend simultaneously to all twelve letters in a 3 x 4 grid, but it does seem introspectively plausible that we visually experience all twelve letters. Therefore, experience overflows attention. (I'm simplifying Block's argument, but I hope this is fair enough.) The problem with Block's version of the argument is that it's plausible that we can attend, in a diffuse way, to the entire display. Attention arguably comes in degrees, and the fact that you're looking at a 3 x 4 display of letters might be represented in your workspace. To move entirely outside of attention, it's safest to shift modalities and choose something far removed from any task -- for example the pressure of your shoes against your feet when that is the farthest thing from your mind. Is that part of your experience?