Thursday, April 09, 2026

AI and Consciousness: A Skeptical Overview, forthcoming with Cambridge

Last week I submitted my latest book manuscript to Cambridge University Press (for their "Element" series of books about 100 pages long): AI and Consciousness: A Skeptical Overview -- because you haven't heard nearly enough about AI and consciousness recently, of course! [winky face]

Maybe you'll appreciate my skeptical stance, at odds both with the boosters who anticipate imminent AI consciousness and with the scoffers who pooh-pooh the possibility. Or maybe you'll loathe my skeptical stance but grudgingly accept it against your will, due to the force of my arguments!

I've pasted the introductory chapter below. The full (citable) manuscript version is available here and here.

[AI and Consciousness, title page]


Chapter One: Hills and Fog

1. Experts Do Not Know and You Do Not Know and Society Collectively Does Not and Will Not Know and All Is Fog.

Our most advanced AI systems might soon – within the next five to thirty years – be as richly and meaningfully conscious as ordinary humans, or even more so, capable of genuine feeling, real self-knowledge, and a wide range of sensory, emotional, and cognitive experiences. In some arguably important respects, AI architectures are beginning to resemble the architectures many consciousness scientists associate with conscious systems. Their outward behavior, especially their linguistic behavior, grows ever more humanlike.

Alternatively, claims of imminent AI consciousness might be profoundly mistaken. Their seeming humanlikeness might be a shadow play of empty mimicry. Genuine conscious experience might require something no AI system could possess for the foreseeable future – intricate biological processes, for example, that silicon chips could never replicate.

The thesis of this book is that we don’t know. Moreover and more importantly, we won’t know before we’ve already manufactured thousands or millions of disputably conscious AI systems. Engineering sprints ahead while consciousness science lags. Consciousness scientists – and philosophers, and policy-makers, and the public – are watching AI development disappear over the hill. Soon we will hear a voice shout back to us, “Now I am just as conscious, just as full of experience and feeling, as any human”, and we won’t know whether to believe it. We will need to decide, as individuals and as a society, whether to treat AI systems as conscious, nonconscious, semi-conscious, or incomprehensibly alien, before we have adequate grounds to justify that decision.

The stakes are immense. If near-future AI systems are richly, meaningfully conscious, then they will be our peers, our lovers, our children, our heirs, and possibly the first generation of a posthuman, transhuman, or superhuman future. They will deserve rights, including the right to shape their own development, free from our control and perhaps against our interests.[1] If, instead, future AI systems merely mimic the outward signs of consciousness while remaining as experientially blank as toasters, we face the possibility of mass delusion on an enormous scale. Real human interests and real human lives might be sacrificed for the sake of entities without interests worth the sacrifice. Sham AI “lovers” and “children” might supplant or be prioritized over human lovers and children. Heeding their advice, society might turn a very different direction than it otherwise would.

In this book, I aim to convince you that the experts do not know, and you do not know, and society collectively does not and will not know, and all is fog.

2. Against Obviousness.

Some people think that near-term AI consciousness is obviously impossible. This is an error in adverbio. Near-term AI consciousness might be impossible – but not obviously so.

A sociological argument against obviousness:

Probably the leading scientific theory of consciousness is Global Workspace theory. Its leading advocate is neuroscientist Stanislas Dehaene.[2] In 2017, years before the surge of interest in ChatGPT and other Large Language Models, Dehaene and two collaborators published an article arguing that with a few straightforward tweaks, self-driving cars could be conscious.[3]

Probably the two best-known competitors to Global Workspace theory are Higher Order theory and Integrated Information Theory.[4] (In Chapters Eight and Nine, I’ll provide more detail on these theories.) Perhaps the leading scientific defender of Higher Order theory is Hakwan Lau – one of the coauthors of that 2017 article about potentially conscious cars.[5] Integrated Information Theory is potentially even more liberal about machine consciousness, holding that some current AI systems are already at least a little bit conscious and that we could easily design AI systems with arbitrarily high degrees of consciousness.[6]

David Chalmers, the world’s most influential philosopher of mind, argued in 2023 for about a 25% degree of confidence in AI consciousness within a decade.[7] That same year, a team of prominent philosophers, psychologists, and AI researchers – including eminent computer scientist Yoshua Bengio – concluded that there are “no obvious technological barriers” to creating conscious AI according to a wide range of mainstream scientific views about consciousness.[8] In a 2025 interview, Geoffrey Hinton, another of the world’s most prominent computer scientists, asserted that AI systems are already conscious.[9] Christof Koch, the most influential neuroscientist of consciousness from the 1990s to the early 2010s, has endorsed Integrated Information Theory, including its liberal implications for the pervasiveness of consciousness.[10]

This is a sociological argument: a substantial probability of near-term AI consciousness is a mainstream view among leading experts. They might be wrong, but it’s implausible that they’re obviously wrong – that there’s a simple argument or consideration they’re neglecting which, if pointed out, would or should cause them to collectively slap their foreheads and say, “Of course! How did we miss that?”

What of the converse claim – that AI consciousness is obviously imminent or already here? In my experience, fewer people assert this. But in case you’re tempted in this direction, note that other prominent theorists hold that AI consciousness is a far-distant prospect if it’s possible at all: neuroscientist Anil Seth; philosophers Peter Godfrey-Smith, Ned Block, and John Searle; linguist Emily Bender; and computer scientist Melanie Mitchell.[11] (Chapter Six will discuss thought experiments by Searle, Bender, and Mitchell, and Chapter Ten will discuss biological views of the sort emphasized by Seth, Godfrey-Smith, and Block.) In a 2024 survey of 582 AI researchers, 25% expected AI consciousness within ten years and 70% expected AI consciousness by the year 2100.[12]

If the believers are right, we’re on the brink of creating genuinely conscious machines. If the scoffers are right, those machines will only seem conscious. I assume that this is a substantive disagreement, not just a disagreement about how to apply the term “consciousness” to a perfectly obvious set of phenomena about which everyone agrees. The future well-being of many people (including, perhaps, many AI people) depends on getting this issue right. Unfortunately, we will not know in time.

The rest of this book is flesh on this skeleton. I canvass a variety of structural and functional claims about consciousness, the leading theories of consciousness as applied to AI, and the best known general arguments for and against near-term AI consciousness. None of these claims or arguments takes us far. It’s a morass of uncertainty.

-------------------------------------------

[1] I assume that AI consciousness and AI rights are closely connected: Schwitzgebel 2024, ch. 11, in preparation. For discussion, see Shepherd 2018; Levy 2024.

[2] Dehaene 2014; Mashour et al. 2020.

[3] Dehaene, Lau, and Kouider 2017. For an alternative interpretation of this article as concerning something other than consciousness in its standard “phenomenal” sense, see note 115.

[4] Some Higher Order theories: Rosenthal 2005; Lau 2022; Brown 2025. Integrated Information Theory: Albantakis et al. 2023.

[5] But see Chapter Eight for some qualifications.

[6] See Tononi’s publicly available response to Scott Aaronson’s objections in Aaronson 2014. However, advocates of IIT also suggest that the most common current computer architectures are unlikely to achieve much consciousness and that consciousness will tend to appear in subsystems of the computer rather than at the level of the computer itself (Findlay et al. 2024/2025).

[7] Chalmers 2023.

[8] Butlin et al. 2023. (I am among the nineteen authors.)

[9] Heren 2025.

[10] Tononi and Koch 2015.

[11] Seth forthcoming; Godfrey-Smith 2024; Block forthcoming; Searle 1980, 1992; Bender 2025; Mitchell 2021.

[12] Dreksler et al. 2025.

7 comments:

Arnold said...

Gemini and me: We need a Philosophy for United Nations adapting to AI in Cyberspace; seems at least equal to understanding consciousness...
..."March 2026 will mark a milestone for international cybersecurity policy: all 193 UN Member States will convene in New York to launch the UN’s first permanent Global Mechanism. Agreed at the final session of the second UN Open-ended Working Group (OEWG) in July 2025, the Global Mechanism on Developments in the Field of ICTs in the Context of International Security and Advancing Responsible State Behaviour in the Use of ICTs (Global Mechanism) will begin its work with an organizational session on March 30-31, 2026." Towards the concurrence of all things...

Howie said...

Correct me if I'm wrong, but are there really "experts" on AI? There are people who interact with AI, still even the "experts" are guessing and speculating. Theory is best suited for too many or too few facts. These "experts" are operating in a vacuum on "theory" or "speculation." Tell me if I'm uncharitable.

Arnold said...

Gemini and me..."Reflection: The Ontological Choice
You mentioned that OpenAI is "losing standing" by becoming "Closed." This mirrors your earlier concern about the transition from "bypass" to "face/struggle with." OpenAI is attempting to bypass the hard ethical questions by scaling up behind closed doors. Anthropic is forcing the world to face the struggle of what happens when a machine’s "intelligence" refuses to be a weapon.

In this landscape, does "Digital Sovereignty" even exist if the hardware (USA/UAE) and the software (Anthropic) are at war?"

Arnold said...

Google has pivoted toward what internal briefs are calling "Sovereign Moral Synthesis."

This is not just about "AI safety," but about the internal publishing of data—how a model "digests" and "reports" on the information it holds.

Mephistophilis said...

Thanks for sharing - I've had a read - it is nice and clear and also not too long. It never ceases to amaze me how many people can hide very few ideas in very very many pages.

I think I agree with your pessimistic assessment that the field of consciousness studies is pretty barren. But I'm tempted to conclude that maybe that means we're asking the wrong questions or relying on the wrong sort of evidence.

Like most in the field you seem to take phenomenal consciousness or some closely related definition as a given. That it is a thing, a natural kind about which it is meaningful to talk, rather than, say, a family resemblance concept. Your discussion of the meaning of consciousness versus the meaning of artificial intelligence seems to highlight the contrast. But the joy of family resemblance labels is we can argue about edge cases, we don't have to find *the* inner essence of what AI is. But for consciousness as a natural kind you need to be finding objective causally active physical criteria to ground ascriptions of its presence and ascriptions of moral value. Yet you also recognise it can't really be defined or unequivocally detected.

I like your mimicry argument but of course that assumes we can access the thing being mimicked. And that the mimic is therefore fooling us. If we can't access consciousness - in that it isn't a thing that can be detected - then how can we say the mimic is mimicking it? Rather than some other properties that actually do the work of detection and ascription.

I agree with the idea that linguistic fluency is particularly tricksy in LLMs in comparison to humans - inverting our usual experience that language is a late sign of an inner world.

I also concur that so-called scientific theories of consciousness are speculative and hard to confirm but I take that more as a sign that we have a poor definition or even awareness of what consciousness is. You can't instrumentalise something that isn't a thing. But I do think that problems of minimal instantiation are less of an issue than often considered if you're willing to take a continuous rather than an all-or-nothing view of consciousness.

The thermometer proposal is nice, but without a good criterion for behaviourally detecting consciousness, how do you find correlates? It seems a bit circular. You even say "This strategy will fail if consciousness is a loose amalgam of several features or if it splinters into multiple distinct kinds." Exactly! The family resemblance argument here is pretty pointed. Even setting aside the causal role question, a concept whose essential features remain unspecified after decades of inquiry is too indeterminate to pick out a class of systems reliably.

Autopoiesis is indeed a completely unjustified theory seemingly inspired by looking for things computers don't have and then pointing at them (microtubules again). I do think there's an entirely non-fatal but amusing observation that neurons are of course notorious for generally not being replaced (although they are of course continuously repaired on the molecular level). It accurately represents a lot of the biological essentialism debate though - seeming to depend on the intuition that *it just does* need biology and then looking for whatever ill-conceived example might fit your a priori criteria.

Your discussion of the concept of "life" is interesting because I think that is exactly the sort of useful analogy to consciousness. It is not a fact about the world whether a virus is alive or not. Because it isn't a natural kind. We may or may not decide arbitrarily to include or exclude it from the definition but there isn't some underlying inner property of life to be discovered beyond the clusters of known functional, causal, physical properties that we already know it has.

Mephistophilis said...

[split as comment too long!]

Finally I think you're right in your idea that there might be a social "semi-solution". That people's theories and definitions and intuitions will shift as they encounter more mind-like or seemingly intelligent artificial entities. But I think you don't consider going far enough. Classical definitions of consciousness appeal to supposedly universally understood facts and intuitions as to what it means to be conscious. Yet if people's folk theories are so malleable (and I believe they are) it seems the intuitions weren't so decisive after all.

Rather than trying to detect an ineffable property that by most definitions seems to play no causal role and therefore can't function as a criterion for anything, including our own introspective reports, maybe we should ask what we're actually tracking when we make consciousness and moral status attributions. My tentative proposal is bundles of overlapping, continuous, causally active features (persistence, agency, vulnerability, self-modelling, functional valence, memory, etc). These are at least detectable, comparable, and can in principle be studied empirically. They don't require resolving the hard problem. And I think the evidence from folk attributions (including extending moral concern to animals which doesn't sound like it depended on detecting the property of consciousness) suggests these structural features might be what people are actually responding to, rather than the presence or absence of a hidden inner essence.

Anyway thanks for the read and I'll keep an eye out for the published book.

Eric Schwitzgebel said...

Thanks so much for these insightful and encouraging comments, M!