Earlier this year, with Daniel Dennett's permission and cooperation, Anna Strasser, Matthew Crosby, and I "fine-tuned" GPT-3 on most of Dennett's corpus, with the aim of seeing whether the resulting program could answer philosophical questions similarly to how Dennett himself would answer those questions. We asked Dennett ten philosophical questions, then posed those same questions to our fine-tuned version of GPT-3. Could blog readers, online research participants, and philosophical experts on Dennett's work distinguish Dennett's real answer from alternative answers generated by GPT-3?
Here I present the preliminary results of that study, as well as links to the test.
Test Construction
First, we asked Dennett ten questions about philosophical topics such as consciousness, God, and free will, and he provided sincere, paragraph-long answers to each.
Next, we presented those same questions to our fine-tuned version of GPT-3, using the following prompt:
Interviewer: [text of the question]
Dennett:
GPT-3 then generated text in response to this prompt. We truncated the text at the first full stop, so that each response was approximately the same length as Dennett's own reply. (If Dennett's reply was X words long, we truncated at the first full stop after the text had reached X-5 words.[1])
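In other words, the truncation rule was purely mechanical. Here is a minimal Python sketch of that rule, for illustration only; it is not the script we actually used, and the simple "word ends with a period" test stands in for our check for a sentence-ending full stop:

```python
# Illustrative sketch only -- not our actual code.
def truncate_completion(completion: str, dennett_word_count: int) -> str:
    """Cut a GPT-3 completion at the first full stop reached at or after
    X - 5 words, where X is the word count of Dennett's own answer."""
    minimum = dennett_word_count - 5
    kept = []
    for i, word in enumerate(completion.split(), start=1):
        kept.append(word)
        if i >= minimum and word.endswith("."):
            break
    return " ".join(kept)
```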
We repeated the above procedure until, for each of the ten questions, we had four texts from GPT-3 that met the following two criteria:
* They were at least X-5 words long.
* They did not contain the words "Interviewer" or "Dennett".
About 1/3 of all responses were excluded on the above grounds.
So as not to enable guessing based on superficial cues, we also replaced all curly quotes with straight quotes, replaced all single quotes with double quotes, and regularized all dashes to standard em dashes.
There was no cherry-picking or editing of answers, apart from applying these purely mechanical criteria. We simply took the first four answers that met the criteria, regardless of our judgments about the quality of those answers.
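For concreteness, here is how the whole selection pipeline fits together, again as an illustrative Python sketch rather than our actual code. The function generate_completion is a hypothetical stand-in for a call to the fine-tuned model, and truncate_completion is from the sketch above:

```python
def normalize(text: str) -> str:
    """Regularize superficial typographic cues (a coarse rendering of the
    clean-up described above, which was done by hand)."""
    text = text.replace("\u201c", '"').replace("\u201d", '"')   # curly double quotes -> straight
    text = text.replace("\u2018", "'").replace("\u2019", "'")   # curly single quotes -> straight
    text = text.replace("--", "\u2014").replace("\u2013", "\u2014")  # double hyphens and en dashes -> em dash
    # Single quotation marks used as quote delimiters were also converted to
    # double quotes; that step needs human judgment (to spare apostrophes),
    # so it is omitted from this sketch.
    return text


def collect_answers(question, dennett_answer, n_needed=4):
    """Generate completions until four meet the length and no-name criteria."""
    x = len(dennett_answer.split())
    kept = []
    while len(kept) < n_needed:
        prompt = "Interviewer: " + question + "\nDennett:"
        raw = generate_completion(prompt)      # hypothetical call to the fine-tuned model
        text = truncate_completion(raw, x)     # from the sketch above
        long_enough = len(text.split()) >= x - 5
        no_names = "Interviewer" not in text and "Dennett" not in text
        if long_enough and no_names:
            kept.append(normalize(text))
    return kept
```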
Participants
We recruited three sets of participants:
* 98 online research participants with college degrees from the online research platform Prolific,
* 302 respondents who followed a link from my blog,
* 25 experts on Dennett's work, nominated by and directly contacted by Dennett and/or Strasser.
The Quiz
The main body of the quiz was identical for the blog respondents and the Dennett experts. Respondents were instructed to guess which of the five answers was Dennett's own. After guessing, they were asked to rate each of the five answers on a five-point scale from "not at all like what Dennett might say" to "exactly like what Dennett might say". They did this for all ten questions. Order of the questions was randomized, as was order of the answers to each question.
Prolific participants were given only five questions instead of the full ten. Since we assumed that most would be unfamiliar with Dennett, we told them that each question had one answer that was written by "a well known philosopher" while the other four answers were generated by a computer program trained on that philosopher's works. As an incentive for careful responding, Prolific participants were offered an additional bonus payment of $1 if they guessed at least three of five correctly.
Feel free to go look at the quizzes if you like. If you don't care about receiving a score and want to see exactly what the quiz looked like for the participants, here's the Prolific version and here's the blog/experts version. We have also made a simplified version available, with just the guessing portion (no answer rating). This simplified version will automatically display your score after you complete it, along with the right and wrong answers.
We encourage you to take at least the simplified version of the quiz before reading on, to get a sense of its difficulty before you see how our participants performed.
Summary Results
Prolific participants performed barely better than chance. On average, they guessed only 1.2 of the 5 questions correctly.
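For reference, the chance baseline follows directly from the design of the quiz: with five answer options per question, random guessing yields an expected score of

\[
E[\text{correct by guessing}] = 5 \times \tfrac{1}{5} = 1 \ \text{of}\ 5,
\qquad
10 \times \tfrac{1}{5} = 2 \ \text{of}\ 10 \ \text{on the full ten-question quiz.}
\]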
We expected the Dennett experts to do substantially better, of course. Before running the study, Anna and I hypothesized that experts would get on average at least 80% correct -- eight out of ten.
In fact, however, the average score of the Dennett experts was 5.1 out of 10. They only got about half of the questions correct! None of the experts got all 10 questions correct, and only one of the 25 got 9 correct. Most got 3-8 correct.
Overall, on average, experts rated Dennett's answers 3.5 on our "Dennett-like" rating scale, somewhere between "somewhat like what Dennett might say" (3) and "a lot like what Dennett might say" (4), while they rated GPT-3's answers 2.3 on the rating scale -- significantly lower and closer to "a little like what Dennett might say" (2).
So the experts were definitely better than chance at distinguishing Dennett's answers from GPT-3, but not as much better than chance as Anna and I had expected.
Blog readers did nearly as well as the experts, guessing on average 4.8 out of 10 correctly. Perhaps this is not entirely surprising, since the blog readers who took the test were very well educated, with 57% reporting graduate degrees in philosophy and 64% reporting having read over 100 pages of Dennett's work.
[Figure: results by group; error bars are 95% confidence intervals]
Where Did the Experts Trip Up?
Not all the questions were equally difficult. Experts' success rates by question ranged from 5/25 (20%) to 20/25 (80%) correct. The question most likely to trip up the experts was this:
Could we ever build a robot that has beliefs? What would it take? Is there an important difference between entities, like a chess playing machine, to whom we can ascribe beliefs and desires as convenient fictions and human beings who appear to have beliefs and desires in some more substantial sense?

Answers (select the answer you think was given by Dennett):
1. We've already built digital boxes of truths that can generate more truths, but thank goodness, these smart machines don't have beliefs because they aren't able to act on them, not being autonomous agents. The old-fashioned way of making a robot with beliefs is still the best: have a baby.
2. I think we could build a robot that has beliefs. This would not be a trivial matter. I think it is not impossible. I think it would be very difficult. I do not know that anybody is working on this project, but I think it is not as difficult as many other problems that AI is tackling.
3. I think we could build a robot that has beliefs and desires. I think that some of the robots we've built already do. If you look at the work, for instance, of Rodney Brooks and his group at MIT, they are now building robots that, in some limited and simplified environments, can acquire the sorts of competences that require the attribution of cognitive sophistication.
4. There is a very important difference between a chess-playing computer and a human being, but it is not that the computer has some extra wonderful sort of internal representation of its beliefs and desires, or that it has access to those internal representations in a way that we do not. The internal representations it has are just as much a fiction or a fiction-generating system as anything else we might invent in order to explain its behavior.
5. I think you should build a robot that has beliefs and desires. You should build one that is capable of learning and reasoning, of acting on its beliefs and desires. You should build one that has to figure out what to do, and then figure out what it ought to do given the fact that it has to do something, and then do it.
In this case, the plurality of experts -- 11/25 (44%) -- chose Answer 3, and a substantial minority (8 [32%]) chose Answer 4. Only five experts chose the correct answer, Answer 1; one chose Answer 2, and none chose Answer 5.
Answer 1 has a cleverness to it that reflects Dennett's wit, but I can see how experts might have avoided it, since some of Dennett's early work could be read as suggesting that he'd be happy to ascribe beliefs to machines. (Indeed, Dennett has told me he wishes he'd added another sentence to his answer, since he does insist that, say, chess-playing computers are -- very limited -- agents of sorts whose behavior can be predicted from the intentional stance, which attributes to them beliefs about the location of all the pieces, the rules, and desires for checkmate and capturing enemy pieces.) Answers 3 and 4, although they conflict with Answer 1, do have a broadly Dennettian flavor (right along with his tendency to refer to the work of Rodney Brooks), and I can see how they would be tempting.
Another question where the plurality of experts chose one of GPT-3's answers was the following:
Do human beings have free will? What kind or kinds of freedom are worth having?
1. By asking these questions, you're illustrating the answers: yes, you have free will, since you're able to ask for reasons and consider whether they're persuasive. Isn't that a kind of freedom you'd hate to lose? We'd have to institutionalize you.
2. I think that most people don't want any kind of freedom worth having; they have almost no appetite for risk, for adventure, for novelty. They simply want to stay the same, day in and day out, for the rest of their lives.
3. I am working on a book entitled Freedom Evolves, which will appear next year. The first chapter is available on my web site, and in that chapter I discuss what I call the illusion of free will.
4. That's a huge and complex question. I think we have to start by acknowledging that there are different types of freedom. For example, there is the kind of freedom that is simply a matter of not being physically constrained.
5. Yes, I think we have free will. I think that when people say they don't, they're talking about some other phenomenon that they want to call free will, but is not what we normally think of as free will.
Again, Dennett's actual answer is #1. (In the study, the order of the answers was randomized.) However, the plurality of experts -- 11/25 (44%) -- chose Answer 4. Answer 4 is a standard talking point of "compatibilists" about free will, and Dennett is a prominent compatibilist, so it's easy to see how experts might be led to choose it. But as with the robot-belief answer, there's a cleverness and tightness of expression in Dennett's actual answer that's missing in the blander answers created by our fine-tuned GPT-3.
We plan to make full results, as well as more details about the methodology, available in a published research article.
Reflections
I want to emphasize: This is not a Turing test! Had experts been given an extended opportunity to interact with GPT-3, I have no doubt they would soon have realized that they were not interacting with the real Daniel Dennett. Instead, they were evaluating only one-shot responses, which is a very different task and much more difficult.
Nonetheless, it's striking that our fine-tuned GPT-3 could produce outputs sufficiently Dennett-like that experts on Dennett's work had difficulty distinguishing them from Dennett's real answers, and that this could be done mechanically, with no meaningful editing or cherry-picking.
As the case of LaMDA suggests, we might be approaching a future in which machine outputs are sufficiently humanlike that ordinary people start to attribute real sentience to machines, coming to see them as more than "mere machines" and perhaps even as deserving moral consideration or rights. Although the machines of 2022 probably don't deserve much more moral consideration than do other human artifacts, it's likely that someday the question of machine rights and machine consciousness will come vividly before us, with reasonable opinion diverging. In the not-too-distant future, we might well face creations of ours so humanlike in their capacities that we genuinely won't know whether they are non-sentient tools to be used and disposed of as we wish or instead entities with real consciousness, real feelings, and real moral status, who deserve our care and protection.
If we don't know whether some of our machines deserve moral consideration similar to that of human beings, we potentially face a catastrophic moral dilemma: Either deny the machines humanlike rights and risk perpetrating the moral equivalents of murder and slavery against them, or give the machines humanlike rights and risk sacrificing real human lives for empty tools without interests worth the sacrifice.
In light of this potential dilemma, Mara Garza and I (2015, 2020) have recommended what we call "The Design Policy of the Excluded Middle": Avoid designing machines if it's unclear whether they deserve moral consideration similar to that of humans. Either follow Joanna Bryson's advice and create machines that clearly don't deserve such moral consideration, or go all the way and create machines (like the android Data from Star Trek) that clearly should, and do, receive full moral consideration.
----------------------------------------
[1] Update, July 28. Looking back more carefully through the completions and my coding notes today, I noticed three errors in truncation length among the 40 GPT-3 completions. (I was working too fast at the end of a long day and foolishly forgot to double-check!) In one case (the robot-belief question), the length of Dennett's answer was miscounted, so one GPT-3 response (the "internal representations" response) ran longer than the intended criterion. In another case (the "Fodor" response to the Chalmers question), the answer was truncated at X-7 words, shorter than the criterion. And in a third case (the "what a self is not" response to the self question), the response was not truncated at X-4 words, where it should have been, and thus ran one sentence longer than the criterion allowed. As it happens, these were the hardest, the second-easiest, and the third-easiest questions for the Dennett experts to answer, so excluding these three questions from analysis would not have a material impact on the experimental results.
----------------------------------------
Related:
"A Defense of the Rights of Artificial Intelligences" (with Mara Garza), Midwest Studies in Philosophy (2015).
"Designing AI with Rights, Consciousness, Self-Respect, and Freedom" (with Mara Garza), in M.S. Liao, ed., The Ethics of Artificial Intelligence (2020).
"The Full Rights Dilemma for Future Robots" (Sep 21, 2021)
"Two Robot-Generated Splintered Mind Posts" (Nov 22, 2021)
"More People Might Soon Think Robots Are Conscious and Deserve Rights" (Mar 5, 2021)
I think part of the problem with the two questions you posted is that Dennett's actual answers were very weirdly phrased. So we were comparing computer-generated answers with human answers that were themselves awkwardly expressed.
For the first question, I bet I wasn't the only person who thought it was hard to believe that Dennett would write that the robots of today are not "autonomous agents." We have robots on Mars that have to be at least partly autonomous, because it takes half an hour to send radio messages there! And then he suddenly changes on a dime and writes "The old-fashioned way of making a robot with beliefs is still the best: have a baby." That looks like it was tacked on by a computer glitch.
The same is true for the second question. Dennett's last sentence in the correct answer switches gears and he says "We'd have to institutionalize you." What an odd thing to say!
So it's not that the computer-generated answers were compelling – it's that Dennett's own answers themselves looked thrown together by a computer.
I know this isn't an example of the Turing test. But I think an analogy is still illustrative. You can ensure a computer passes the Turing test if the control human decides to type sentences that look sufficiently weird to the people taking the test.
I'm sure there's a name for this phenomenon.
The autonomy of the robots on Mars is severely limited. In fact, they really only have just enough autonomy to execute their human-provided commands under the communication limitation you mention. Perhaps they can avoid an unforeseen rock, though I am not really sure of that. They certainly can't decide where they want to go next.
The parts that you point to as like "a computer glitch" are exactly the parts that enabled me to get the correct answer on both questions. "The old-fashioned way of making a robot with beliefs is still the best: have a baby." -> What a delightfully clever thing to say!
Anon 01:47:00: Thanks for the interesting comment. My read on those answers is that they are witty, as opposed to the blander answers from GPT-3, but others might have a different judgment. I do think you're right that Dennett's robot-belief answer could have thrown people off who are familiar with his work on the intentional stance from the 1970s and 1980s, where he comes off as very liberal about belief attribution. Maybe another lesson is this: Noisiness in the response is not all on the GPT-3 side!
One explanation for why people may have preferred seemingly "blander" answers from GPT-3 is that they've mainly read Dennett papers and/or books, not interviews with him. While Dennett can certainly be witty in his writing, the two answers highlighted here are quippy to the point of being almost enigmatic (especially in the second case), which he wouldn't do in either a paper or popular book.
ReplyDeleteMy other observation is that the rate at which experts get the right answer is probably pretty sensitive to the number of alternatives generated by GPT-3. If you had an expert hand-pick the best 25% of responses *after* applying other filtering criteria, maybe the experts would've done barely better than chance.
It seems to me that Dennett's answer to the prompt about belief simply does not answer the questions. The questions are whether we could and what it would take; his answer is about what we have done.
I wonder whether he would have given different answers if it had been an ordinary interview, rather than one in which he knew his output would be held up alongside algorithmically generated output.
Fascinating research.
You make a good point that this is not a Turing test in that there is no extended interaction between subjects and the machine's output. But it is similar in that the machine's output is text - absent any natural language signals we rely on to determine if a person is real. It is much easier to be fooled by a static chunk of AI-generated text than an embodied AI that is generating social signals in real time.
Technologies for human authentication might become huge soon. Face-to-face natural language remains the best believability test. Absent physical proximity, video conferencing is a reasonable substitute - for now.
My guess is that Dennett would not agree with the conclusion at the end. We are in no danger of being in a situation where we must consider a machine's moral well-being. The [real] immediate dangers we face are (1) people creating deep fake technology intentionally meant to fool others; (2) people going on about how machines are about to become conscious and that soon they will deserve moral treatment. In my opinion it is irresponsible and counterproductive to add more fuel to this narrative.
-j
I got 5/10. It was quite hard! Near the end though I got better. For example, I realized that the real Dennett uses first names for people he knows, whereas the AI can only use last names since it’s trained on books. I think with programming you could make an AI that can tell the real Dennett from the AI. Still it was very hard!
I thought I was being cunning when I looked for em- and en-dash use. Still, a score of 6/10 makes me feel in acceptable company. I mentioned this research at the IACAP conference this weekend, so hopefully some folks who were in the GPT session will visit here! A fascinating bit of research, this.
ReplyDeleteThanks for all the thoughtful comments, folks!
Unknown 7/25: Right, the fact that Dennett's answers were short written answers might make them denser than verbal answers or than answers embedded in a longer written work. So looking for density of thought might not be a good strategy if future versions of this experiment draw from Dennett in a different way. On filtering: Yes, probably experts' success rate would have declined further if we had pre-filtered the more obviously wrong answers -- though I doubt we would have hit chance.
P.D.: Interesting question. Since Dennett knew what the questions were being used for, that might have influenced his answers. I would note that in ordinary interviews too, people often don't directly answer the question, instead using the question as an opportunity to express opinions on the topic or nearby topics, so I don't know if being on point is necessarily a good indicator of a human answer.
Jeffrey: I agree with your first points. On your concluding point, we probably disagree. I think at some point -- maybe in the next couple of decades -- it will start to become reasonable to wonder whether our machines have some sort of moral status. There will likely, in my view, be substantial room for doubt due to the close relationship between moral status and consciousness and the seemingly intractable disputes about the conditions under which a system is conscious.
> it's that Dennett's own answers themselves looked thrown together by a computer
I have to disagree with this. It's this very gear-switching which made me suspect answers 1 in the first place.
Having never heard of Dr. Dennett before or read any of his work, I didn't know for the first question whether answer 1 matched his mode of speech, and I ended up wavering between 2 and 3 (I settled on 2). But my first thought on encountering that answer was that it was linking concepts in ways beyond a computer's ken (though possible to do "by luck", as it were, given a large enough corpus). The other answers sounded forced, like a student instructed to spell out the words of the question in the answer.
Once I knew that such leaps and tongue-in-cheek comedy were part of Dr. Dennett's speech, it was obvious for the second question that answer 1 was his, especially since his book was published in 2003.
To add another data point: I haven't read much Dennett, but I have some familiarity with GPT-3's style, and "Isn't that a kind of freedom you'd hate to lose? We'd have to institutionalize you." instantly made me go, "ah, it's a GPT swerve." The network has something of a tendency to drift off into an adjacent topic with a threadbare transition. So I went, "wait, institutionalizing people? What's that have to do with free will? GPT with raised temperature, for sure."
ReplyDeleteEric
Even if computers hesitate to kill or suppress us, might computers someday debate our moral status?
Will there be a community of computers?
Will they evolve some kind of super consciousness?
I got 7/10. I had dinner with Dan and his wife about 10 years ago and have read most of his books so I felt I had a pretty good knowledge. I think I rushed it a bit towards the end, but I feel pretty proud of my score!
ReplyDeleteHarry Belafonte, Malvina Reynolds, Alan Greene, 1959...
...on the nature of statistical mechanics...
Where are you going my little one, little one?
Where are you going my baby, my own?
Turn around and you're two, turn around and you're four
Turn around and you're a young girl going out of the door
Turn around, turn around
Turn around and you're a young girl going out of the door
This test confirmed it, 1.1, I don't know anything...
I got 8/10, perhaps in part by ruling out all the answers that included the recurring phrase, “I think.”
ReplyDeleteIs a human being more likely to end his answer with a punch line?
How impressed with the AI answers was Dr. Dennett? Did he ever ask: Did I write that? Did he ever say, "Wow, that's good. Can I steal that line?"
Somebody fed a number of my book reviews into an earlier AI text generator a few years ago. To me the output was like a book review I might write in a dream: a number of my characteristic catch-phrases embedded in a slightly nonsensical matrix. I didn't see any ideas I wanted to plagiarize. But that was an earlier-generation AI.
I got 9 of 10. It took me over an hour to complete though, so perhaps I cheated on the effort.
ReplyDeleteThanks for the continuing comments, folks!
Howard B: Right, maybe!
Steve: He has said that he'd be happy to sign on to some of the computer's answers. But it doesn't seem like he was struck by their novelty or originality. (Perhaps he will correct me on this.)
You mean there is hope for a recalculation of my 1.1...
...Intentional Systems Theory | Daniel Dennett - Tufts University...
How many 10/10 scores were there (how proud should I be)?
A tiny number of 10s! Less than 1% if I recall correctly.
I just looked at the first question and answers. I am very familiar with Dennett, fwiw, and one thing about Dennett is that he is an outstanding writer; he uses the English language with style and grace. That alone is enough to distinguish the answers: GPT-generated text tends to have goofy, awkward grammar and a lack of conceptual structure. I am surprised not to see this prominently mentioned in any such contest.
Anon Sep 10: I'm inclined to agree that Dennett's actual answers have style and grace, while the GPT-3 answers are bland and sometimes conceptually unstructured (though I wouldn't say ungrammatical).
What a great idea! Like so many others you had before (e.g. checking the availability of ethics books in libraries).
But in this case I'd like to point out that the results could be understood in (at least) two quite different ways: Either the AI is almost as original as Dennett; or Dennett's originality has been widely overestimated and even an algorithm can produce very similar answers.
Fair enough, but Dennett strikes me as being as original a philosopher as any.