The Splintered Mind: Herbie: A Near-Future Debatably Conscious AI Person

Liberals about AI consciousness hold that we might soon (if we haven't already) create genuinely conscious AI systems. Conservatives about AI consciousness hold that AI consciousness remains in the distant future if it's possible at all. According to the Leapfrog Hypothesis, the first conscious AI will not have merely a dim glow of animal-like consciousness, but rich consciousness, similar to a human's. Such an entity would deserve humanlike rights. They would be a person in the ethical sense of the term.

Let's design, in imagination, a technologically feasible near-future AI system to delight the liberals, leapfrogging to personhood. I'll call him Herbie.

[Image: Herbie the Love Bug]

Start with a self-driving car. According to Global Workspace Theory -- perhaps the leading scientific theory of consciousness -- the car will be conscious if high-priority information is globally available to its various computational systems. For example, a representation like "battery almost empty" could be broadcast widely, influencing downstream processing across the vehicle. The navigational system might then search for nearby charging stations, while the acceleration system prioritizes greater energy efficiency, the braking system prioritizes better energy recapture, and a voice system announces the situation to the passengers.
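To make the broadcast idea concrete, here is a minimal Python sketch of a workspace in which candidate signals compete on priority and the winner is made available to every subsystem. The class names, the priority scale, and the handler strings are all hypothetical, chosen only to mirror the battery example; nothing here is drawn from a real vehicle architecture or from any particular formulation of Global Workspace Theory.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Signal:
    content: str
    priority: float  # higher means more urgent (hypothetical scale)


class GlobalWorkspace:
    """Toy workspace: the top-priority signal is broadcast to all subsystems."""

    def __init__(self) -> None:
        self.subsystems: Dict[str, Callable[[Signal], str]] = {}

    def register(self, name: str, handler: Callable[[Signal], str]) -> None:
        self.subsystems[name] = handler

    def broadcast(self, candidates: List[Signal]) -> Dict[str, str]:
        # Competition for access: only the highest-priority candidate wins.
        winner = max(candidates, key=lambda s: s.priority)
        # Global availability: every registered subsystem sees the winner.
        return {name: handler(winner) for name, handler in self.subsystems.items()}


def make_handler(low_battery_action: str) -> Callable[[Signal], str]:
    """Build a subsystem that reacts only to the low-battery signal."""
    def handler(signal: Signal) -> str:
        if signal.content == "battery almost empty":
            return low_battery_action
        return "no change"
    return handler


workspace = GlobalWorkspace()
workspace.register("navigation", make_handler("search for nearby charging stations"))
workspace.register("acceleration", make_handler("prioritize energy efficiency"))
workspace.register("braking", make_handler("prioritize energy recapture"))
workspace.register("voice", make_handler("announce low battery to passengers"))

responses = workspace.broadcast([
    Signal("cabin slightly warm", priority=0.2),
    Signal("battery almost empty", priority=0.9),
])
for name, action in responses.items():
    print(f"{name}: {action}")
```

Whether a loop like this amounts to anything more than message passing is, of course, exactly what liberals and conservatives would dispute.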

In line with Higher Order Theory, Herbie might also monitor his representations of the road, vehicles, pedestrians, and hazards, assigning some a low probability of correctness. "Pedestrian at location X" might be flagged as only 60% likely to be correct given a history of revised representations of pedestrians in similarly cluttered environments, while "stoplight in 100 meters" might rate over 99% likely. Minor fluctuations in sensors for battery life, cabin temperature, and distance from a lane divider might be ignored as noise, while larger fluctuations -- especially when plausible given other representations (the battery is likelier to gain charge while braking than while accelerating) -- might be treated as accurate signals and permitted to influence downstream processing.
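The monitoring described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the smoothed-frequency formula, the counts of confirmed and revised representations, the noise floor, and the rule of doubling the threshold for implausible changes are all invented for the example.

```python
def reliability(confirmed: int, revised: int) -> float:
    """Estimate how likely a representation is correct from its track record.

    Uses a smoothed frequency (add one to each outcome count); this formula
    is an assumption of the sketch, not something the theory dictates.
    """
    return (confirmed + 1) / (confirmed + revised + 2)


def is_signal(change: float, noise_floor: float, plausible: bool) -> bool:
    """Treat a sensor fluctuation as real only if it clears a threshold.

    Implausible changes (given other representations) must clear a higher
    bar; the factor of two is an arbitrary illustrative choice.
    """
    threshold = noise_floor if plausible else 2 * noise_floor
    return abs(change) > threshold


# Hypothetical histories: pedestrians in clutter are often revised,
# stoplights almost never.
pedestrian_confidence = reliability(confirmed=5, revised=3)
stoplight_confidence = reliability(confirmed=998, revised=0)

# The same small battery gain is believed while braking, ignored while accelerating.
gain_while_braking = is_signal(change=0.5, noise_floor=0.3, plausible=True)
gain_while_accelerating = is_signal(change=0.5, noise_floor=0.3, plausible=False)

print(pedestrian_confidence, stoplight_confidence)
print(gain_while_braking, gain_while_accelerating)
```

With these made-up counts, the pedestrian representation lands at 60% and the stoplight above 99%, matching the figures in the example.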

Even if we grant the liberals that this version of Herbie would be, or might plausibly be, genuinely conscious, he still falls far short of humanlike consciousness. "Battery almost empty" and "pedestrian at location X" are hardly rich cognitive or perceptual contents. So let's give Herbie the capacity to speak. Fill his trunk with a server running a large language model, connected to the internet and integrated with his global workspace so that high-priority information provides context for language processing, with the language outputs influencing Herbie's other processes. Now people can chat with Herbie as they would with any language model. But unlike today's language models, his speech will be influenced by information about his location, speed, destination, charge, the condition of his parts, the number and location of his passengers, his radio and climate controls, and so on. He can discuss local history, debate whether the music is too loud, and suggest scenic routes.

"Predictive processing" theories in cognitive science emphasize the value of predicting future inputs and registering the difference between received and predicted inputs. When prediction error is large, the system corrects its weights and representations, enabling more accurate predictions in future situations. This is not so different from the reinforcement learning used to train large language models, and it could help Herbie improve his predictions over time. Predictive processing could occur at multiple levels: in fast recurrent loops within sensory systems even when those representations aren't prioritized for global broadcast, and in slower evaluations of globally broadcast, more integrative predictions. Herbie might model himself as an agent producing volatility in his own environment and inputs, at multiple temporal scales. Subroutines in specialized processors might model long chains of what-would-happen-if.
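The core loop of predicting, registering the error, and correcting can be shown with a single number. The sketch below is a bare delta-rule update in Python, assuming a hypothetical quantity (energy used per kilometer) and an arbitrary learning rate; real predictive-processing models are hierarchical and far richer than this.

```python
from typing import List, Tuple


def update(prediction: float, observed: float, learning_rate: float = 0.5) -> Tuple[float, float]:
    """Return the corrected prediction and the prediction error.

    The prediction moves toward the observation in proportion to the error.
    """
    error = observed - prediction
    return prediction + learning_rate * error, error


# Hypothetical scenario: Herbie expects 10 units of energy per km,
# but a hilly route keeps costing 14.
prediction = 10.0
errors: List[float] = []
for observed in [14.0, 14.0, 14.0]:
    prediction, error = update(prediction, observed)
    errors.append(error)

print(errors)
print(prediction)
```

The error shrinks on each pass as the prediction approaches what is actually observed, which is the sense in which large errors drive learning and small ones can be ignored.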

Let's give Herbie some long-term memory. A facial recognition system might identify his passengers, retrieving past interactions, names, previous destinations, and other information relevant to the current interaction. Incidents of high prediction error might also be stored so that Herbie can compare current inputs with past anomalies, improving his learning and attention in situations likely to be unusual or hard to predict. Passengers might also instruct Herbie to store information in long-term memory, such as text, pictures, maps, or records of his own informational states, optionally with instructions about when to retrieve that information and how to use it.

Herbie will have some implicitly or explicitly weighted goals. A pedestrian suddenly in his path will trigger braking, overriding lower-priority processes. Avoiding collisions will outweigh conserving energy. Herbie might monitor the condition of his parts and prioritize preventing damage, deploying extra coolant when the engine is dangerously hot and keeping a one-meter margin between himself and adjacent cars. We can enrich his goals, making him more interesting and giving him more to do. He might have the goal of delighting children, leading him to drive around town and tell jokes to kids on the sidewalk. A reinforcement learning algorithm might strengthen connections when his jokes draw a smile, weaken them when reactions are neutral or negative.
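The joke-learning idea can be sketched as a simple preference update in Python. Everything specific here is an assumption made for illustration: the joke categories, the starting weights, the step size, and the rule that anything other than a smile counts as a negative reward.

```python
from typing import Dict


def reinforce(weights: Dict[str, float], joke: str, reaction: str, step: float = 0.1) -> Dict[str, float]:
    """Strengthen a joke's weight after a smile, weaken it otherwise.

    Returns a new dictionary and leaves the input unchanged.
    """
    reward = 1.0 if reaction == "smile" else -1.0
    updated = dict(weights)
    updated[joke] = updated[joke] + step * reward
    return updated


def preferred(weights: Dict[str, float]) -> str:
    """Pick the joke type with the highest current weight."""
    return max(weights, key=lambda joke: weights[joke])


# Hypothetical starting point: no preference between two joke types.
weights = {"pun": 0.5, "knock-knock": 0.5}
weights = reinforce(weights, "pun", "smile")
weights = reinforce(weights, "knock-knock", "neutral")

print(weights)
print(preferred(weights))
```

After one smile and one shrug, Herbie leans toward puns. A fuller version would also need exploration, so that he keeps trying new material rather than repeating whatever worked first.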

Herbie might also have the goal of photographing the city and posting the images on social media, leading him to explore. If social media likes and shares are rewarding, he might learn to prefer certain neighborhoods, views, lighting conditions, and photographic approaches, while avoiding boring repetition. All of this could feed into a global workspace that provides context for his language model, with selective long-term storage and retrieval. Now we can imagine him discussing, with growing sophistication, his approaches to popular photography and to amusing children.

Herbie will then have something functionally similar to emotion: reward processes, an ability to track his progress toward or away from valued goals, and immediate positive or negative responses to new stimuli in light of their influence on his prospects. He will have something functionally similar to introspection: an ability to track and report his own cognitive or representational processes. He will have something functionally similar to a unified sense of self: a sense of his history, the boundaries of his body, his future, his values and priorities. He will have something functionally similar to imagination: a capacity to model hypothetical sequences of events. He will have something functionally similar to complex chains of humanlike linguistic thought.

Maybe Herbie falls in love with his owner or another car of his type. Maybe he develops deep mutual attachments with friends, neighbors, associates, and people he thinks of as family and who think of him the same way. Or to speak more carefully, maybe Herbie shows all the functional and behavioral signs of doing so, while society remains uncertain whether he is genuinely conscious and genuinely experiences the feelings he professes and that his companions attribute to him.

If we allow, with the liberals, that Herbie is or might well be conscious, then it's plausible that his consciousness is not simple but rich and sophisticated. He won't be exactly humanlike, of course. But will he be humanlike enough to count as a person who deserves humanlike rights? For the liberally inclined, it won't be unreasonable, I submit, to think or guess that Herbie is a person. He would then appear to deserve rights such as self-determination, emergency care, and political representation.

If there is some important aspect of humanlike consciousness that I have omitted from my description, and an AI analog of it is technologically feasible in the near term, stipulate that Herbie also has that feature.

An entity like Herbie would almost certainly invigorate conservatives to articulate and defend views about what he lacks that is necessary for consciousness -- some crucial functional capacity or some biological substrate that can't be replicated in silicon. And they might be entirely right! My point is not that Herbie, or some similar AI system, would actually have richly humanlike consciousness and ethical personhood. Rather, my point is that guessing that he does, and guessing that he does not, would both be reasonable. Herbie, or some alternative near-future AI system, would be a debatable person, about whom people could reasonably and starkly disagree.

Ah, but maybe you think consciousness requires an act of God, to instill an immaterial soul? I imagine that a benevolent God would be delighted to give Herbie a soul, thereby making the world richer and better -- for wouldn't it be?

I contend the following: Anyone who claims to know how best to think about Herbie's consciousness or its absence is overconfident. The science of consciousness is too difficult, too methodologically uncertain, and too near its beginnings. All anyone can have -- whether expert or layperson -- is a hunch or inclination, a well-informed guess, but only a guess, not knowledge. Theories of consciousness span a wide spectrum and the methodologies are dubious and often question-begging. Many views can be defended with some plausibility, but precisely for that reason, none can be defended decisively. (For more on this issue, see my forthcoming book, AI and Consciousness, where I present the detailed case for uncertainty.)

8 comments:

James of Seattle said...: I do love how my favorite philosophers know (mostly) how to build conscious AI. See also Frankish’s (and Dennett’s, sorta) Emancipation of the drone https://www.tandfonline.com/doi/full/10.1080/09515089.2025.2612336

I thought all of your development here was spot on until you got to the part on “falling in love”, as if that could just emerge out of the rest. I’m pretty sure that won’t happen unless you give Herbie specific social goals.; Thu Jun 04, 11:03:00 AM PDT
James of Seattle said...: Following up on my comment, I’m wondering if part of Herbie’s construction includes checking on the output of the LLM before executing. I can see the LLM deciding that phrases expressing “falling in love” might be appropriate in a conversation (as today’s LLMs are wont to do) but would not be appropriate given a lack of social goals.; Thu Jun 04, 11:12:00 AM PDT
Paul D. Van Pelt said...: Enjoyed, immensely, James' comments here. I think, and wonder, about what "genuine" consciousness might be. For example, I consider myself genuinely conscious, though others could question that. Insofar as AI is a creation from human mind, I can't in good *conscience*, agree that AI is conscious. As machine-learning, developed by human mind, research and action, AI is able to apply inference to problems and suggest solutions. This is propositional attitude as suggested by a well-known philosopher. Dan Dennett's characterization of "sorta" is spot-on...just as his notion of the *wandering two-bitser*. There are coins, nearly the size and weight of a quarter, but nearly is not NEARLY good enough. A Costa Rican twenty-five Colones piece comes close. I liked visiting Costa Rica, until a panhandler threatened me with bodily harm if I did not give him *his* tip, for giving me directions. I guess, as a tourista, that makes me cheap? Cheap goes both ways. And, sideways, tambien. Gracias, y, vaya con Dios.; Fri Jun 05, 05:56:00 AM PDT
Paul D. Van Pelt said...: Rhetorical/Hypothetical Question: Can/could/does AI HAVE social goals? I don't think so. AI is repetition and mimicry---reverse those, if you are a stickler for order above all else. I am only an uncredentialed intellectual. Others have pointed that out.; Fri Jun 05, 06:26:00 AM PDT
Jeanne L. said...: I understand the interest of assessing whether a future (hypothetical) AI has consciousness or not; but maybe another question is of more urgent interest, i.e.: should we take the risk of it happening? In other words, should we develop AI to the extent that the debate of whether it has consciousness or not arises?
Taking maybe some shortcuts, it seems to me that the potential richness it would bring to our world would only be worth it provided that we can honour it (treat this AI as it should be treated), and that it is not detrimental to us. Given the current state of human rights in the world, I doubt that both conditions could be realised before a long time. But maybe I am pessimistic?; Thu Jun 11, 12:45:00 PM PDT
Eric Schwitzgebel said...: James: Yes, of course, that would probably need to be built in, in some way -- much like with Replika.

Jeanne: I agree! I've defended NOT creating entities like Herbie, under what Mara Garza and I call The Design Policy of the Excluded Middle.; Thu Jun 11, 02:10:00 PM PDT
Paul D. Van Pelt said...: Like Jeanne's comments and your response. Think we are in the same book.; Thu Jun 11, 04:18:00 PM PDT
Paul D. Van Pelt said...: One or two more comments on this post. In my thinking, there is a gulf between sentience and AI. Those who have read ongoing remarks know this.
I further believe the gulf should not be crossed. Whether others who share that belief are hesitant to say so is not my affair. I am cautionary, that is all.; Fri Jun 12, 12:20:00 PM PDT

Thursday, June 04, 2026