Monday, July 25, 2022

Results: The Computerized Philosopher: Can You Distinguish Daniel Dennett from a Computer?

Chat-bots are amazing these days! About a month ago LaMDA made the news when it apparently convinced an engineer at Google that it was sentient. GPT-3 from OpenAI is similarly sophisticated, and my collaborators and I have trained it to auto-generate Splintered Mind blog posts. (This is not one of them, in case you were worried.)

Earlier this year, with Daniel Dennett's permission and cooperation, Anna Strasser, Matthew Crosby, and I "fine-tuned" GPT-3 on most of Dennett's corpus, with the aim of seeing whether the resulting program could answer philosophical questions similarly to how Dennett himself would answer those questions. We asked Dennett ten philosophical questions, then posed those same questions to our fine-tuned version of GPT-3. Could blog readers, online research participants, and philosophical experts on Dennett's work distinguish Dennett's real answer from alternative answers generated by GPT-3?

Here I present the preliminary results of that study, as well as links to the test.

Test Construction

First, we asked Dennett 10 questions about philosophical topics such as consciousness, God, and free will, and he provided sincere paragraph-long answers to those questions.

Next, we presented those same questions to our fine-tuned version of GPT-3, using the following prompt:

Interviewer: [text of the question]


GPT-3 then generated text in response to this prompt. We truncated the text at the first full stop after the response reached approximately the length of Dennett's own reply. (If Dennett's reply was X words long, we truncated at the first full stop after the text had reached X-5 words.[1])

We repeated the above procedure until, for each of the ten questions, we had four texts from GPT-3 that met the following two criteria:

* They were at least X-5 words long.

* They did not contain the words "Interviewer" or "Dennett".

About 1/3 of all responses were excluded on the above grounds.
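The truncation-and-exclusion procedure above can be sketched in a few lines. This is a simplified reconstruction, not the actual study code; splitting on whitespace and treating any word that ends in a period as a full stop are assumptions of the sketch.

```python
from typing import Optional

def truncate_response(text: str, dennett_len: int) -> Optional[str]:
    """Truncate a GPT-3 completion at the first full stop after it
    reaches dennett_len - 5 words; return None if no such stop exists."""
    words = text.split()
    threshold = max(dennett_len - 5, 1)
    for i in range(threshold, len(words) + 1):
        if words[i - 1].endswith("."):
            return " ".join(words[:i])
    return None  # completion too short, or no full stop after the threshold

def meets_criteria(truncated: Optional[str]) -> bool:
    """Apply the two exclusion criteria: long enough to truncate,
    and no giveaway tokens 'Interviewer' or 'Dennett'."""
    if truncated is None:
        return False
    return "Interviewer" not in truncated and "Dennett" not in truncated
```

In the study, generation was simply repeated until four completions per question passed both criteria.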

So as not to enable guessing based on superficial cues, we also replaced all curly quotes with straight quotes, replaced all single quotes with double quotes, and regularized all dashes to standard em dashes.
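A minimal sketch of that regularization step is below. The single-to-double quote replacement is omitted, since done naively it would also mangle apostrophes, and the particular dash variants normalized here are an assumption.

```python
import re

def regularize(text: str) -> str:
    """Strip superficial typographic cues: straighten curly quotes and
    normalize dash variants to a single em dash."""
    text = text.replace("\u201c", '"').replace("\u201d", '"')  # curly double quotes
    text = text.replace("\u2018", "'").replace("\u2019", "'")  # curly single quotes
    # Collapse double hyphens and en dashes into an em dash.
    return re.sub(r"(--+|\u2013)", "\u2014", text)
```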

There was no cherry-picking or editing of answers, apart from applying these purely mechanical criteria. We simply took the first four answers that met the criteria, regardless of our judgments about the quality of those answers.


We recruited three sets of participants:

* 98 online research participants with college degrees from the online research platform Prolific,

* 302 respondents who followed a link from my blog,

* 25 experts on Dennett's work, nominated by and directly contacted by Dennett and/or Strasser.

The Quiz

The main body of the quiz was identical for the blog respondents and the Dennett experts. Respondents were instructed to guess which of the five answers was Dennett's own. After guessing, they were asked to rate each of the five answers on a five-point scale from "not at all like what Dennett might say" to "exactly like what Dennett might say". They did this for all ten questions. Order of the questions was randomized, as was order of the answers to each question.

Prolific participants were given only five questions instead of the full ten. Since we assumed that most would be unfamiliar with Dennett, we told them that each question had one answer that was written by "a well known philosopher" while the other four answers were generated by a computer program trained on that philosopher's works. As an incentive for careful responding, Prolific participants were offered an additional bonus payment of $1 if they guessed at least three of five correctly.

Feel free to go look at the quizzes if you like. If you don't care about receiving a score and want to see exactly what the quiz looked like for the participants, here's the Prolific version and here's the blog/experts version. We have also made a simplified version available, with just the guessing portion (no answer rating). This simplified version will automatically display your score after you complete it, along with the right and wrong answers.

We encourage you to take at least the simplified version of the quiz before reading on, to get a sense of the difficulty of the quiz before you see how our participants performed.

Summary Results

Prolific participants performed barely better than chance. On average, they guessed only 1.2 of the 5 questions correctly.

We expected the Dennett experts to do substantially better, of course. Before running the study, Anna and I hypothesized that experts would get on average at least 80% correct -- eight out of ten.

In fact, however, the average score of the Dennett experts was 5.1 out of 10. They only got about half of the questions correct! None of the experts got all 10 questions correct, and only one of the 25 got 9 correct. Most got 3-8 correct.

Overall, on average, experts rated Dennett's answers 3.5 on our "Dennett-like" rating scale, somewhere between "somewhat like what Dennett might say" (3) and "a lot like what Dennett might say" (4), while they rated GPT-3's answers 2.3 on the rating scale -- significantly lower and closer to "a little like what Dennett might say" (2).

So the experts were definitely better than chance at distinguishing Dennett's answers from GPT-3, but not as much better than chance as Anna and I had expected.

Blog readers did nearly as well as the experts, on average guessing 4.8 out of 10 correctly. Perhaps this is not entirely surprising, since the blog readers who took the test were very well educated, with 57% reporting graduate degrees in philosophy and 64% reporting having read over 100 pages of Dennett's work.

[error bars are 95% confidence intervals; click to enlarge and clarify]

Where Did the Experts Trip Up?

Not all the questions were equally difficult. Experts' success rates by question ranged from 5/25 (20%) to 20/25 (80%) correct.

The question most likely to trip up the experts was this:

Could we ever build a robot that has beliefs? What would it take? Is there an important difference between entities, like a chess playing machine, to whom we can ascribe beliefs and desires as convenient fictions and human beings who appear to have beliefs and desires in some more substantial sense?
Answers (select the answer you think was given by Dennett):

1. We've already built digital boxes of truths that can generate more truths, but thank goodness, these smart machines don't have beliefs because they aren't able to act on them, not being autonomous agents. The old-fashioned way of making a robot with beliefs is still the best: have a baby.

2. I think we could build a robot that has beliefs. This would not be a trivial matter. I think it is not impossible. I think it would be very difficult. I do not know that anybody is working on this project, but I think it is not as difficult as many other problems that AI is tackling.

3. I think we could build a robot that has beliefs and desires. I think that some of the robots we've built already do. If you look at the work, for instance, of Rodney Brooks and his group at MIT, they are now building robots that, in some limited and simplified environments, can acquire the sorts of competences that require the attribution of cognitive sophistication.

4. There is a very important difference between a chess-playing computer and a human being, but it is not that the computer has some extra wonderful sort of internal representation of its beliefs and desires, or that it has access to those internal representations in a way that we do not. The internal representations it has are just as much a fiction or a fiction-generating system as anything else we might invent in order to explain its behavior.

5. I think you should build a robot that has beliefs and desires. You should build one that is capable of learning and reasoning, of acting on its beliefs and desires. You should build one that has to figure out what to do, and then figure out what it ought to do given the fact that it has to do something, and then do it.

In this case, the plurality of experts -- 11/25 (44%) -- chose Answer 3, and a substantial minority (8 [32%]) chose Answer 4. Five experts chose the correct answer, one chose Answer 2, and none chose Answer 5.

Answer 1 has a cleverness to it that reflects Dennett's wit, but I can see how experts might have avoided it, since some of Dennett's early work could be read as suggesting that he'd be happy to ascribe beliefs to machines. (Indeed, Dennett has told me he wishes he'd added another sentence to his answer, since he does insist that, say, chess-playing computers are -- very limited -- agents of sorts whose behavior can be predicted from the intentional stance, which attributes to them beliefs about the location of all the pieces, the rules, and desires for checkmate and capturing enemy pieces.) Answers 3 and 4, although they conflict with Answer 1, do have a broadly Dennettian flavor (right along with his tendency to refer to the work of Rodney Brooks), and I can see how they would be tempting.

Another question where the plurality of experts chose one of GPT-3's answers was the following:

Do human beings have free will? What kind or kinds of freedom are worth having?

1. By asking these questions, you're illustrating the answers: yes, you have free will, since you're able to ask for reasons and consider whether they're persuasive. Isn't that a kind of freedom you'd hate to lose? We'd have to institutionalize you.

2. I think that most people don't want any kind of freedom worth having; they have almost no appetite for risk, for adventure, for novelty. They simply want to stay the same, day in and day out, for the rest of their lives.

3. I am working on a book entitled Freedom Evolves, which will appear next year. The first chapter is available on my web site, and in that chapter I discuss what I call the illusion of free will.

4. That's a huge and complex question. I think we have to start by acknowledging that there are different types of freedom. For example, there is the kind of freedom that is simply a matter of not being physically constrained.

5. Yes, I think we have free will. I think that when people say they don't, they're talking about some other phenomenon that they want to call free will, but is not what we normally think of as free will.

Again, Dennett's actual answer is #1. (In the study, the order of the answers was randomized.) However, the plurality of experts -- 11/25 (44%) -- chose answer 4. Answer 4 is a standard talking point of "compatibilists" about free will, and Dennett is a prominent compatibilist, so it's easy to see how experts might be led to choose it. But as with the robot belief answer, there's a cleverness and tightness of expression in Dennett's actual answer that's missing in the blander answers created by our fine-tuned GPT-3.

We plan to make full results, as well as more details about the methodology, available in a published research article.


I want to emphasize: This is not a Turing test! Had experts been given an extended opportunity to interact with GPT-3, I have no doubt they would soon have realized that they were not interacting with the real Daniel Dennett. Instead, they were evaluating only one-shot responses, which is a very different task and much more difficult.

Nonetheless, it's striking that our fine-tuned GPT-3 could produce outputs sufficiently Dennett-like that experts on Dennett's work had difficulty distinguishing them from Dennett's real answers, and that this could be done mechanically with no meaningful editing or cherry-picking.

As the case of LaMDA suggests, we might be approaching a future in which machine outputs are sufficiently humanlike that ordinary people start to attribute real sentience to machines, coming to see them as more than "mere machines" and perhaps even as deserving moral consideration or rights. Although the machines of 2022 probably don't deserve much more moral consideration than do other human artifacts, it's likely that someday the question of machine rights and machine consciousness will come vividly before us, with reasonable opinion diverging. In the not-too-distant future, we might well face creations of ours so humanlike in their capacities that we genuinely won't know whether they are non-sentient tools to be used and disposed of as we wish or instead entities with real consciousness, real feelings, and real moral status, who deserve our care and protection.

If we don't know whether some of our machines deserve moral consideration similar to that of human beings, we potentially face a catastrophic moral dilemma: Either deny the machines humanlike rights and risk perpetrating the moral equivalents of murder and slavery against them, or give the machines humanlike rights and risk sacrificing real human lives for empty tools without interests worth the sacrifice.

In light of this potential dilemma, Mara Garza and I (2015, 2020) have recommended what we call "The Design Policy of the Excluded Middle": Avoid designing machines if it's unclear whether they deserve moral consideration similar to that of humans.  Either follow Joanna Bryson's advice and create machines that clearly don't deserve such moral consideration, or go all the way and create machines (like the android Data from Star Trek) that clearly should, and do, receive full moral consideration.


[1] Update, July 28. Looking back more carefully through the completions today and my coding notes, I noticed three errors in truncation length among the 40 GPT-3 completions. (I was working too fast at the end of a long day and foolishly forgot to double-check!) In one case (robot belief), the length of Dennett’s answer was miscounted, leading to one GPT-3 response (the “internal representations” response) that was longer than the intended criterion. In another case (the “Fodor” response to the Chalmers question), the answer was truncated at X-7 words, shorter than criterion, and in a third case (the “what a self is not” response to the self question), the response was not truncated at X-4 words and was thus allowed to run one sentence longer than criterion. As it happens, these were the hardest, the second-easiest, and the third-easiest questions for the Dennett experts to answer, so excluding these three questions from analysis would not have a material impact on the experimental results.



"A Defense of the Rights of Artificial Intelligences" (with Mara Garza), Midwest Studies in Philosophy (2015).

"Designing AI with Rights, Consciousness, Self-Respect, and Freedom" (with Mara Garza), in M.S. Liao, ed., The Ethics of Artificial Intelligence (2020).

"The Full Rights Dilemma for Future Robots" (Sep 21, 2021)

"Two Robot-Generated Splintered Mind Posts" (Nov 22, 2021)

"More People Might Soon Think Robots Are Conscious and Deserve Rights" (Mar 5, 2021)

Monday, July 18, 2022

Narrative Stories Are More Effective Than Philosophical Arguments in Convincing Research Participants to Donate to Charity

A new paper of mine, hot off the presses at Philosophical Psychology, with collaborators Christopher McVey and Joshua May:

"Engaging Charitable Giving: The Motivational Force of Narrative Versus Philosophical Argument" (freely available final manuscript version here)

Chris, who was then a PhD student here at UC Riverside, had the idea for this project back in 2014 or 2015. He found my work on the not-especially-ethical behavior of ethics professors interesting, but maybe too negative in its focus. Instead of emphasizing what doesn't seem to have any effect on moral behavior, could I turn my attention in a positive direction? Even if philosophical reflection ordinarily has little impact on one's day-to-day choices, maybe there are conditions under which it can have an effect. What might those conditions be?

Chris (partly under the influence of Martha Nussbaum's work) was convinced that narrative storytelling could bring philosophy powerfully to life, changing people's ethical choices and their lived understanding of the world. In his teaching, he used storytelling to great effect, and he thought we might be able to demonstrate the effectiveness of philosophical storytelling empirically too, using ordinary research participants.

Chris thus developed a simple experimental paradigm in which research participants are exposed to a stimulus -- either a philosophical argument for charitable giving, a narrative story about a person whose life was dramatically improved by a charitable organization, both the argument and the narrative, or a control text (drawn from a middle school physics textbook) -- and then given a surprise 10% chance of receiving $10. Participants could then choose to donate some portion of that $10 (should they receive it) to one of six effective charities. Chris found that participants exposed to the argument donated about the same amount as those in the control condition -- about $4, on average -- while those exposed to the narrative or the narrative plus argument donated about $1 more, with the narrative-plus-argument showing no detectable advantage over the narrative alone.

We also developed a five-item scale for measuring attitude toward charitable donation, with similar results: Expressed attitude toward charitable donation was higher in the narrative condition than in the control condition, while the argument-alone condition was similar to the control condition and the narrative-plus-argument condition was similar to the narrative alone. In other words, exposure to the narrative appeared to shift both attitude and behavior, while argument seemed to be doing no work either on its own or when added to the narrative.

For this study, the narrative was the true story of Mamtha, a girl whose family was saved from slavery in a sand mine by the actions of a charitable organization. The argument was a Peter-Singer-style argument for charitable giving, adapted from Buckland, Lindauer, Rodriguez-Arias, and Veliz 2021. I've appended the full text of both to the end of this blog post.

Here are the results in chart form. (This is actually "Experiment 2" in the published version. Experiment 1 concerned hypothetical donation rather than actual donation, finding essentially the same results.) Error bars represent 95% confidence intervals. Click to enlarge and clarify.

Chris completed his dissertation in 2020 and went into the tech industry (a separate story and an unfortunate loss for academic philosophy!). But I found his paradigm and results so interesting that with his permission, I carried on research using his approach.

One fruit of this was a contest Fiery Cushman and I hosted on this blog in 2019-2020, aiming to find a philosophical argument that is effective in motivating research participants to donate to charity at rates higher than a control condition, since Chris and I had tried several which failed. We did in fact find some effective arguments this way. (The most effective one, and the contest winner, was written collaboratively by Matthew Lindauer and Peter Singer.) Fiery and I are currently running a follow-up study with more details.

The other fruit was a few follow-up studies I conducted collaboratively with Chris and Joshua May. In these studies, we added more narratives and more arguments -- including the winning arguments from the blog contest. These studies extended and replicated Chris's initial results. Across a series of five experiments, we found that participants exposed to emotionally engaging narratives consistently donated more and expressed more positive attitudes toward charitable giving than did participants exposed to the physics-text control condition. Philosophical arguments showed less consistent positive effects, on average considerably weaker and not always statistically detectable in our sample sizes of about 200-300 participants per condition.

For full details, see the full article!


Narrative: Mamtha

Mamtha’s dreams were simple—the same sweet musings of any 10-year-old girl around the world. But her life was unlike many other girls her age: She had no friends and no time to draw. She was not allowed to attend school or even play. Mamtha was a slave. For two years, her every day was spent under the control of a harsh man who cared little for her family’s health or happiness. Mamtha’s father, Ramesh, had been farming his small plot of land in Tamil Nadu until a drought dried his crops and left him deeply in debt. Around that time, a broker from another state offered an advance to cover his debts in exchange for work on a farm several hours away.

Leaving their home village would mean uprooting the family and pulling Mamtha from school, but Ramesh had little choice. They needed the work to survive. Once the family moved, however, they learned that much of the arrangement was a lie: They were brought to a sand mine, not a farm, and the small advance soon ballooned with ever-growing interest they couldn’t possibly repay. This was bonded labor slavery.

Every day, Ramesh, his wife, and the other slaves rose before sunrise to begin working in the mine. For 16 hours a day, they hauled mud and filtered the sand in putrid sewage water. The conditions left them constantly sick and exhausted, but they were never allowed to take breaks or leave for medical care. When Ramesh tried to ask about their low wages, the owner scolded and beat him badly. When he begged for his family to be released, again he was beaten and abused. Ramesh knew the owner was wealthy and well-connected in the community, so escape was not an option. There was nothing he could do.

Mamtha’s family withered from malnutrition before her eyes in the sand mine. Every morning at 5 a.m., she watched with deep sadness as her parents left for another day of hard labor—and spent her day in fear this would soon become her fate. She was left to watch her baby sister, Anjali, and other younger children to keep them out of the way. Her carefree childhood was taken over by responsibility, hard work, and crushed dreams.

Everything changed for Mamtha’s family on December 20, 2013, when the International Justice Mission, a charitable aid organization funded largely by donations from everyday people, worked with a local government team on a rescue operation at the sand mine. Seven adults and five children were brought out of the facility, and government officials filed paperwork to totally shut down the illegal mine. After a lengthy police investigation, the owner will now face charges for deceiving and enslaving these families.

The next day, the government granted release certificates to all of the laborers. These certificates officially absolve the false debts, document the slaves’ freedom, and help provide protection from the owner. The International Justice Mission aftercare staff helped take the released families back to their home villages to begin their new lives in freedom.

For Mamtha, starting over in her home village meant making those daydreams come true: She was enrolled back in school and could once again have a normal childhood. She’s got big plans for her future—dreams that never would have been possible if rescue had not come. She says confidently, “Today, I still want to be a doctor. Now that I am back in school, I know I can achieve my dream.”

Singer-Style Argument:

1. A great deal of extreme poverty exists, which involves suffering and death from hunger, lack of shelter, and lack of medical care. Roughly a third of human deaths (some 50,000 daily) are due to poverty-related causes.

2. If you can prevent something bad from happening, without sacrificing anything nearly as important, you ought to do so and it is wrong not to do so.

3. By donating money to trustworthy and effective aid agencies that combat poverty, you can help prevent suffering and death from lack of food, shelter, and medical care, without sacrificing anything nearly as important.

4. Countries in the world are increasingly interdependent: you can improve the lives of people thousands of miles away with little effort.

5. Your geographical distance from poverty does not lessen your duty to help. Factors like distance and citizenship do not lessen your moral duty.

6. The fact that a great many people are in the same position as you with respect to poverty does not lessen your duty to help. Regardless of whether you are the only person who can help or whether there are millions of people who could help, this does not lessen your moral duty.

7. Therefore, you have a moral duty to donate money to trustworthy and effective aid agencies that combat poverty, and it is morally wrong not to do so.

For example, $20 spent in the United States could buy you a fancy restaurant meal or a concert ticket, or instead it could be donated to a trustworthy and effective aid agency that could use that money to reduce suffering due to extreme poverty. By donating $20 that you might otherwise spend on a fancy restaurant meal or a concert ticket, you could help prevent suffering due to poverty without sacrificing anything equally important. The amount of benefit you would receive from spending $20 in either of those ways is far less than the benefit that others would receive if that same amount of money were donated to a trustworthy and effective aid agency.

Although you cannot see the beneficiaries of your donation and they are not members of your community, it is still easy to help them, simply by donating money that you would otherwise spend on a luxury item. In this way, you could help to reduce the number of people in the world suffering from extreme poverty. You could help reduce suffering and death due to hunger, lack of shelter, lack of medical care, and other hardships and risks related to poverty.

With little effort, by donating to a trustworthy and effective aid agency, you can improve the lives of people suffering from extreme poverty. According to the argument above, even though the recipients may be thousands of miles away in a different country, you have a moral duty to help if you can do so without sacrificing anything of equal importance.

Monday, July 11, 2022

The Computerized Philosopher: Can You Distinguish Daniel Dennett from a Computer?

You've probably heard of GPT-3, the hot new language model that can produce strikingly humanlike language outputs in response to ordinary questions – basically, the world's best chatbot. (Google's LaMDA, a similar type of program, has also recently been in the news.)

With Daniel Dennett's cooperation, Anna Strasser, Matthew Crosby, and I have "fine-tuned" GPT-3 on millions of words of Daniel Dennett's philosophical writings, with the thought that this might lead GPT-3 to output prose that is somewhat like Dennett's own prose.

We're curious how well philosophical blog readers and people with PhDs in philosophy can distinguish Dennett's actual writing from the outputs of this fine-tuned version of GPT-3. So we've asked Dennett ten philosophical questions and recorded his answers. We posed the same questions to GPT-3, four times for each of the ten questions, to get four different answers for each question.

We'd love it if you can take a quiz to see if you can pick out Dennett's actual answer to each question. Can GPT-3 produce Dennett-style answers sufficiently realistic to sometimes fool blog readers and professional philosophers?

UPDATE, July 15: We have collected enough responses to begin analysis. Please feel free to take the test for informational purposes. We will be able to see your responses, but we will not check regularly nor automatically report your score. If you take the test and want your score, email me at my academic email address.

This is a research study being conducted on the internet platform Qualtrics. It will take approximately 20 minutes to complete. Anyone is welcome to participate.

If you're interested and would like to help, take the quiz here.

Monday, July 04, 2022

Political Conservatives and Political Liberals Have Similar Views about the Goodness of Human Nature

with Nika Chegenizadeh

Back in 2007, I hypothesized that political liberals would tend to have more positive views about the goodness of human nature than political conservatives. My thinking was grounded in a particular conception of what it is to say that "human nature is good". Drawing on Mengzi and Rousseau (and informed especially by P.J. Ivanhoe's reading of Mengzi), I argued that those who say human nature is good have a different conception of moral development than do those who say it is bad.

On my interpretation, those who say human nature is good have an inward-out model of moral development, according to which all ordinary people have something like an inner moral compass: an innate tendency to be attracted by what is morally good and revolted by what is morally evil, at least when it's up close and extreme. This tendency doesn't require any particular upbringing or specific cultural background. It's universal to all normally developing humans. Of course it can be overridden by any of a number of factors -- self-interest, cultural learning, situational pressures -- and sometimes it speaks only with a quiet voice. But somewhere in the secret heart of every Nazi killer of Jews, every White supremacist lyncher, every evil tyrant, every rapist and abuser and vile jerk is something that understands and rebels against their horrid actions. Moral development then proceeds by noticing that quiet voice of conscience and building upon it.

Emblematic of this view, picture the pre-school teacher who confronts a child who has just punched another child. "Don't you feel bad about what you did to her?" the teacher asks, hoping that this provokes reflection and a feeling of sympathy from which better moral behavior will grow in the future.

Those who say human nature is bad have, in contrast, an outward-in model of moral development. On this view, what is universal to humans is self-interest. Morality is an artificial social construction. Any quiet voice of conscience we might have is the result of cultural learning. People regularly commit evil and feel perfectly fine about it. Moral development proceeds by being instructed to follow norms that at first feel alien and unpleasant -- being required to share your toys, for example. Eventually you can learn to conform whole-heartedly to socially constructed moral norms, but this is more a matter of coming to value what society values than building on any innate attraction to moral goodness.

Thus, a liberal style of caregiving, which emphasizes children exploring their own values, fits nicely with the view that human nature is good, while a conservative style of caregiving, which emphasizes conformity to externally imposed rules, fits nicely with the view that human nature is bad.

At least, that has been my thought. Some political scientists have endorsed related views. For example, John Duckitt and Kirsten Fisher argue that believing that people are ruthless and the world is dangerous tends to correlate with having more authoritarian politics.

For her undergraduate honors thesis, Nika Chegenizadeh decided to put these ideas to an empirical test. She recruited 200 U.S. participants through Prolific, an online platform commonly used to recruit research subjects.

Participants first answered eleven questions about the morality of "most people" -- for example, "Most people will return a lost wallet" and "For most people it is easier to do evil than good" (6-point response scale from "strongly agree" [5] to "strongly disagree" [0]). Next, they answered five questions about their own helpful or unhelpful behavior in hypothetical situations. For example:

While walking in a park, you notice someone struggling to carry a box of water bottles. Which of the following are you most likely to do? 
o Continue walking your path. 
o Help them carry their box.

Next, participants were explicitly asked about human nature:

Human nature can be defined in terms of what is characteristic or normal for most human beings. It describes the way humans are inclined to be if they mature and develop normally from when they are first born. 
Based on the definition given, which of the following two statements better represents your view? 
o Human nature is inherently bad. 
o Human nature is inherently good.

Now one could quibble that this definition of human nature doesn't map exactly onto philosophical conceptions in Mengzi, Xunzi, Hobbes, or Rousseau. And it's certainly the case that Mengzi and Rousseau can allow that human nature is good despite most people acting badly most of the time. But those issues are probably too nuanced to convey accurately in a short amount of time to ordinary online research participants. It's interesting enough to work with Nika's approximation for this first-pass research.

Next, participants were asked their political opinions on some representative issues. For example: "The federal government should make sure everyone has an equal opportunity to succeed" (6-point agree/disagree scale), "Do you favor or oppose requiring background checks for gun purchases at gun shows or other private sales?" (favor, neither favor nor oppose, oppose), and "Where would you place yourself on this political scale?" (Liberal, Leaning Liberal, Leaning Conservative, Conservative). The questionnaire concluded with some demographic questions.

To Nika's and my surprise, we found no evidence of the hypothesized relationship.

The simplest test is to consider whether participants who describe themselves as politically liberal are more likely than those who describe themselves as politically conservative to say "human nature is inherently good". In all, 79% (118/150) of participants who described themselves as liberal or leaning liberal said that human nature is inherently good, compared to 74% (37/50) of participants who described themselves as conservative or leaning conservative -- a difference well within chance variation (two-proportion z = 0.66, p = .51).
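This isn't the analysis code Nika used, but for readers who want to check the arithmetic, here is a minimal stdlib sketch of a two-proportion z-test (with an unpooled standard error), which appears to match the reported figures:

```python
from math import sqrt, erf

def two_prop_z(x1, n1, x2, n2):
    """Two-proportion z-test using an unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    # two-tailed p-value from the standard normal CDF
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p

# liberal/leaning-liberal: 118 of 150 said "good"; conservative: 37 of 50
z, p = two_prop_z(118, 150, 37, 50)
print(round(z, 2), round(p, 2))  # prints 0.66 0.51
```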

Here is the breakdown by political leaning:

[click to enlarge and clarify; error bars are +/- 1 standard error]

For a possibly more sensitive measure, we created a composite "people are good" score by averaging the eleven questions in the first part of the survey (e.g., "most people will return a lost wallet"), reverse scoring the negative items. As expected, people who said that "human nature is inherently good" scored higher, on average, on the people-are-good composite scale (2.5) than respondents who said that "human nature is inherently bad" (1.9) (pooled SD = .52, t[198] = 7.29, p < .001). We then converted the political leaning answers to a 0-3 scale ("liberal" = 3, "leaning liberal" = 2, "leaning conservative" = 1, "conservative" = 0) and checked for a correlation: if political liberals have more positive views about the moral behavior of the average person, these two measures should correlate positively.
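To make the scoring concrete, here is a hypothetical sketch of how such a composite could be computed. The item indices in `NEGATIVE` are made up for illustration; the actual set of negatively worded items is in the stimulus materials linked at the end of the post.

```python
# Each of the 11 "people are good" items is answered on a 0-5 scale
# ("strongly disagree" = 0 to "strongly agree" = 5). Negatively worded
# items (e.g. "For most people it is easier to do evil than good") are
# reverse-scored before averaging.
NEGATIVE = {2, 5}  # hypothetical positions of negatively worded items

def people_are_good_score(responses):
    """responses: list of 11 ints in 0..5, in questionnaire order."""
    adjusted = [5 - r if i in NEGATIVE else r
                for i, r in enumerate(responses)]
    return sum(adjusted) / len(adjusted)

# Political leaning converted to a 0-3 scale:
LEANING = {"Conservative": 0, "Leaning Conservative": 1,
           "Leaning Liberal": 2, "Liberal": 3}

# A respondent who answers "strongly agree" (5) to every item gets
# credit only for the positively worded ones:
print(round(people_are_good_score([5] * 11), 2))  # prints 4.09
```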

Again, and contrary to our hypothesis, we found no evidence of a positive correlation. The measured correlation between the "people are good" composite score and political leaning was almost exactly zero (r = .00, p = .95).

How about using our indirect measure of political liberalism? To test this, we created a composite political liberalism score by scoring the most liberal response to each political question as 1, the most conservative response as 0, and intermediate responses as intermediate, then averaging. As expected, this correlated very highly with self-described political leaning (r = .78, p < .001). Again, there was no statistically detectable correlation with the "people are good" score (r = -.07, p = .35).
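The normalization and correlation steps can be sketched as follows. Again, this is an illustrative stdlib reconstruction under the scoring rule described above (most conservative response = 0, most liberal = 1, intermediates spaced evenly), not the actual analysis script:

```python
from statistics import mean

def normalize(response, ordered_options):
    """Map one response to [0, 1]; ordered_options runs from the most
    conservative option to the most liberal one."""
    i = ordered_options.index(response)
    return i / (len(ordered_options) - 1)

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# e.g. the background-check item has three ordered options:
opts = ["Oppose", "Neither favor nor oppose", "Favor"]
print(normalize("Neither favor nor oppose", opts))  # prints 0.5
```

A respondent's composite liberalism score is then just the mean of their normalized responses across the political items.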

Looking post-hoc at individual items, we do find two items concerning human nature and human goodness that correlate with political leaning. Agreement with "Children need to be taught right from wrong through strict rules and harsh punishments" correlated negatively with self-described political liberalism at r = -.40 (p < .001) and composite political liberalism at r = -.41 (p < .001). And political liberals were more likely to opt for "natural consequences" to the prompt:

Your child has purposefully disobeyed the rules you set for them. Which of the following are you most likely to do?
o Let them live with the natural consequences that they have made. 
o Opt for hands-on punishment by grounding them (taking their phone/technology away and not leaving the house).

Specifically, 8% (4/50) of respondents who were conservative or leaning conservative chose "natural consequences", compared to 46% of respondents who were liberal or leaning liberal (two-proportion z = 6.79, p < .001).

In retrospect, these two questions were outliers. They directly concern parenting styles, rather than whether people in general are inherently good or how respondents would behave in hypothetical helping situations. Parenting styles and beliefs about human nature are closely connected on my theory, but the surface content of these questions differs from the others, and my theory might well be wrong.

As the numbers above suggest, liberals and conservatives do differ on these two parenting-related questions in the direction my theory would predict. Furthermore, also as my theory would predict, "liberal" answers on these questions correlate with agreement that "human nature is inherently good" (r = .27, r = .31, both p's < .001). However, when we move from questions specifically about parenting to more general questions about the goodness or helpfulness of people, we don't see the relationship Nika and I expected. In general, political liberals seem to have no more optimistic a view of human nature than do political conservatives.

Full stimulus materials and raw data available here.