Friday, February 16, 2024

What Types of Argument Convince People to Donate to Charity? Empirical Evidence

Back in 2020, Fiery Cushman and I ran a contest to see if anyone could write a philosophical argument that convinced online research participants to donate a surprise bonus to charity at rates statistically above control. (Chris McVey, Josh May, and I had failed to write any successful arguments in some earlier attempts.) Contributions were not permitted to mention particular real people or events, couldn't be narratives, and couldn't include graphics or vivid descriptions. We wanted to see whether relatively dry philosophical arguments could move people to donate.

We received 90 submissions (mostly from professional philosophers, psychologists, and behavioral economists, but also from other Splintered Mind readers), and we selected 20 that we thought represented a diversity of the most promising arguments. The contest winner was an argument written by Matthew Lindauer and Peter Singer, highlighting that a donation of $25 can save a child in a developing country from going blind due to trachoma, then asking the reader to reflect on how much they would be willing to donate to save their own child from going blind. (Full text here.)

Kirstan Brodie, Jason Nemirow, Fiery, and I decided to follow up by testing all 90 submitted arguments to see what features were present in the most effective arguments. We coded the arguments according to whether, for example, they mentioned children, or appealed to religion, or mentioned the reader's assumed own economic good fortune, etc. -- twenty different features in all. We recruited approximately 9000 participants. Each participant had a 10% chance of winning a surprise bonus of $10. They could either keep the whole $10 or donate some portion of it to one of six effective charities. Participants decided whether to donate, and how much, before knowing if they were among the 10% receiving the $10.

Now, unfortunately, proper statistical analysis is complicated. Because we were working with whatever came in, we couldn't balance argument features: most arguments had multiple coded features, and the coded features tended to correlate across submissions. I'll share a proper analysis of the results later. Today I'll share a simpler analysis. This simple analysis looks at the coded features one by one, comparing the average donation among the set of arguments with the feature to the average donation among the set of arguments without the feature.

There is something to be said, I think, for simple analyses even when they aren't perfect: They tend to be easier to understand and to have fewer "researcher degrees of freedom" (and thus less opportunity for p-hacking). Ideally, simple and sophisticated statistical analyses go hand-in-hand, telling a unified story.

So, what argument features appear to be relatively more versus less effective in motivating charitable giving?

Here are our results, from highest to lowest difference in mean donation. "diff" is the dollar difference in mean donation, N is the number of participants who saw an argument with that feature, n is the number of arguments containing that feature, and p is the statistical p-value in a two-sample t test (without correction for multiple comparisons). All analyses are tentative, pending double-checking, skeptical examination, and possibly some remaining data clean-up.
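For readers who want to see the shape of this one-feature-at-a-time comparison, here is a minimal Python sketch. It is an illustration only, not the analysis code we ran: the function name `compare_feature` and the toy donation lists are hypothetical, and the p-value uses a normal approximation to the t distribution, which is an assumption that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def compare_feature(with_feature, without_feature):
    """Compare mean donations for arguments with vs. without a coded feature.

    Returns (diff, t, p), where t is Welch's two-sample t statistic and p is a
    two-sided p-value from a normal approximation (an assumption; an exact
    test would use the t distribution instead).
    """
    diff = mean(with_feature) - mean(without_feature)
    # Standard error of the difference, allowing unequal variances (Welch).
    se = sqrt(variance(with_feature) / len(with_feature)
              + variance(without_feature) / len(without_feature))
    t = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return diff, t, p


# Hypothetical toy data: dollars donated (0 to 10) by individual participants.
saw_feature = [5, 3, 10, 0, 4, 6, 2, 5, 7, 3]
no_feature = [3, 2, 5, 0, 4, 1, 3, 0, 6, 2]

diff, t, p = compare_feature(saw_feature, no_feature)
print(f"diff = ${diff:.2f}, t = {t:.2f}, p = {p:.3f}")
```

Running the same comparison once per coded feature, with no correction for multiple comparisons, gives a list of the kind shown below.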

Predictive Argument Features, Highest to Lowest

Does the argument appeal to the notion of equality?
$3.99 vs $3.39 (diff = $.60, N = 395, n = 4, p < .001)

... mention human evolutionary history?
$3.93 vs $3.39 (diff = $.55, N = 4940, n = 5, p < .001)

... specifically mention children?
$3.76 vs $3.26 (diff = $.49, N = 4940, n = 27, p < .001)

... mention a specific, concrete benefit to others that $10 or a similar amount would bring (e.g., 3 mosquito nets or a specific inexpensive medical treatment)?
$3.75 vs $3.44 (diff = $.41, N = 1718, n = 17, p < .001)

... appeal to the diminishing marginal utility of dollars kept by (rich) donors?
$3.69 vs $3.29 (diff = $.40, N = 2843, n = 27, p < .001)

... appeal to the massive marginal utility of dollars transferred to (poor) recipients?
$3.65 vs $3.25 (diff = $.40, N = 3758, n = 36, p < .001)

... mention, or ask the participant to bring to mind, a particular person who is physically or emotionally near to them?
$3.74 vs $3.34 (diff = $.34, N = 318, n = 3, p = .061)

... mention particular needs or hardships such as clean drinking water or blindness?
$3.56 vs $3.23 (diff = $.30, N = 4940, n = 49, p < .001)

... refer to the reader's own assumed economic good fortune?
$3.58 vs $3.31 (diff = $.27, N = 3544, n = 35, p < .001)

... focus on one, single issue? (e.g. trachoma)
$3.61 vs $3.40 (diff = $.21, N = 800, n = 8, p = .07)

... remind people that giving something is better than nothing? (i.e. corrective for drop-in-the-bucket thinking)
$3.56 vs $3.40 (diff = $.15, N = 595, n = 6, p = .24)

... appeal to the views of experts (e.g. philosophers, psychologists)?
$3.47 vs $3.39 (diff = $.07, N = 2629, n = 27, p = .29)

... reference specific external sources such as news reports or empirical studies?
$3.47 vs $3.40 (diff = $.07, N = 1828, n = 18, p = .41)

... explicitly mention that donation is common?
$3.46 vs $3.41 (diff = $.05, N = 736, n = 7, p = .66)

... appeal to the notion of randomness/luck (e.g., nobody chose the country they were born in)?
$3.43 vs $3.41 (diff = $.02, N = 1403, n = 14, p = .80)

... mention religion?
$3.35 vs $3.42 (diff = -$.07, N = 905, n = 9, p = .48)

... appeal to veil-of-ignorance reasoning or other perspective-taking thought experiments?
$3.29 vs $3.43 (diff = -$.14, N = 4940, n = 8, p = .20)

... mention that giving could inspire others to give? (i.e. spark behavioral contagion)
$3.29 vs $3.43 (diff = -$.14, N = 896, n = 9, p = .20)

... explicitly mention and address specific counterarguments?
$3.29 vs $3.45 (diff = -$.15, N = 1829, n = 19, p = .048)

... appeal to the self-interest of the participant?
$3.22 vs $3.49 (diff = -$.30, N = 2604, n = 22, p < .001)

From this analysis, several argument features appear to be effective in increasing participant donations:

  • mentioning children and appealing to the equality of all people,
  • mentioning concrete benefits (one or several),
  • mentioning the reader's assumed economic good fortune and the relatively large impact of a relatively small sacrifice (the "margins" features), and
  • mentioning evolutionary history (e.g., theories that human beings evolved to care more about near others than distant others).

Mentioning a particular near person might also have been effective, but since only three arguments were coded in this category, statistical power was poor.

    In contrast, appealing to the participant's self-interest (e.g., that donating will make them feel good) appears to have backfired. Mentioning and addressing counterarguments to donation (e.g., responding to concerns that donations are ineffective or wasted) might also have backfired.

    Now I don't think we should take these results wholly at face value. For example, only five of the ninety arguments appealed to evolutionary history, and all of those arguments included at least two other seemingly effective features: particular hardships, margins, or children. In multiple regression analyses and multi-level analyses that explore how the argument features cluster, it looks like particular hardships, children, and margins might be more robustly predictive -- more on that in a future post. ETA (Feb 19): Where a feature appears in fewer than ten arguments (n < 10), effects are unlikely to be statistically robust.

    What if we combine argument features? There are various ways to do this, but the simplest is to give an argument one point for any of the ten largest-effect features, then perform a linear regression. The resulting model has an intercept of $3.09 and a slope of $.13. Thus, the model predicts that participants who read arguments with none of these features will donate $3.09, while participants who read a hypothetical argument containing all ten features will donate $4.39.
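The arithmetic of that point-score model is easy to sketch in Python. The intercept and slope below are the figures reported above; the function name `predicted_donation` and the range check are illustrative assumptions, not part of our analysis.

```python
INTERCEPT = 3.09  # predicted mean donation (dollars) with none of the ten features
SLOPE = 0.13      # predicted increase (dollars) per additional feature


def predicted_donation(feature_count):
    """Predicted mean donation for an argument with `feature_count` of the ten features."""
    if not 0 <= feature_count <= 10:
        raise ValueError("feature_count must be between 0 and 10")
    return INTERCEPT + SLOPE * feature_count


for k in (0, 5, 10):
    print(f"{k} features: ${predicted_donation(k):.2f}")
```

Keep in mind that no submitted argument contained all ten features, so the prediction at ten is an extrapolation of the fitted line rather than an observed mean.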

    Further analysis also suggests that piling up argument features is cumulative: Arguments with at least six of the effective features generated mean donations of $3.89 (vs. $3.37), those with at least seven generated mean donations of $4.46 (vs. $3.38), and the one argument with eight of the ten effective features generated a mean donation of $4.88 (vs. $3.40) (all p's < .001). This eight-feature argument was, in fact, the best performing argument of the ninety. (However, caution is warranted concerning the estimated effect size for any particular argument: With only approximately 100 participants per argument and a standard deviation of about $3, the 95% confidence intervals for the effect size of individual arguments are about +/- $.60.)
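The rough size of that per-argument uncertainty can be checked with a back-of-the-envelope calculation. This sketch makes simplifying assumptions: it uses the normal critical value 1.96, and it treats the comparison mean as fixed because the comparison group is so much larger. The function name `ci_half_width` is hypothetical.

```python
from math import sqrt


def ci_half_width(sd, n, z=1.96):
    """Approximate 95% confidence-interval half-width for a mean of n observations."""
    return z * sd / sqrt(n)


# Figures from the text: a standard deviation of about $3, about 100 participants.
half_width = ci_half_width(sd=3.0, n=100)
print(f"+/- ${half_width:.2f}")
```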

    ------------------------------------------------------

    Last month, I articulated and defended the attractiveness of moral expansion through Mengzian extension. On my interpretation of the ancient Chinese philosopher Mengzi, expansion of one's moral perspective often (typically?) begins with noticing how you react to nearby cases -- whether physically nearby (a child in front of you, about to fall into a well) or relationally nearby (your close family members) -- and proceeds by noticing that remote cases (distant children, other people's parents) are similar in important respects.

    None of the twenty coded features captured exactly that. ("Particular near person" was close, but neither necessary nor sufficient: not necessary, because the coders used a stringent standard for when an argument invoked a particular near person, and not sufficient since invoking a particular near person is only the first step in Mengzian extension.) So I asked UCR graduate student Jordan Jackson, who studies Chinese philosophy and with whom I've discussed Mengzian extension, to read all 90 arguments and code them for whether they employed Mengzian extension style reasoning. He found six that did.

    In accord with my hypothesis about the effectiveness of Mengzian extension, the six Mengzian extension arguments outperformed the arguments that did not employ Mengzian extension:

    $3.85 vs $3.38 (diff = $.47, N = 612, n = 6, p < .001)

    Among those six arguments are both the 2020 original contest winner written by Lindauer and Singer and also the best-performing argument in the present study -- though as noted earlier, the best-performing argument in the current study also had many other seemingly effective features.

    In case you're curious, here's the full text of that argument, adapted by Alex Garinther from one of the stimuli in Lindauer et al. 2020, which it quotes extensively:

    HEAR ME OUT ON SOMETHING. The explanation below is a bit long, but I promise reading the next few paragraphs will change you.

    As you know, there are many children who live in conditions of severe poverty. As a result, their health, mental development, and even their lives are at risk from lack of safe water, basic health care, and healthy food. These children suffer from malnutrition, unsanitary living conditions, and are susceptible to a variety of diseases. Fortunately, effective aid agencies (like the Against Malaria Foundation) know how to handle these problems; the issue is their resources are limited.

    HERE'S A PHILOSOPHICAL ARGUMENT: Almost all of us think that we should save the life of a child in front of us who is at risk of dying (for example, a child drowning in a shallow pond) if we are able to do so. Most people also agree that all lives are of equal moral worth. The lives of faraway children are no less morally significant than the lives of children close to us, but nearby children exert a more powerful emotional influence. Why?

    SCIENTISTS HAVE A PLAUSIBLE ANSWER: We evolved in small groups in which people helped their neighbors and were suspicious of outsiders, who were often hostile. Today we still have these “Us versus Them” biases, even when outsiders pose no threat to us and could benefit enormously from our help. Our biological history may predispose us to ignore the suffering of faraway people, but we don't have to act that way.

    By taking money that we would otherwise spend on needless luxuries and donating it to an effective aid agency, we can have a big impact. We can provide safe water, basic health care, and healthy food to children living in severe poverty, saving lives and relieving suffering.

    Shouldn't we, then, use at least some of our extra money to help children in severe poverty? By doing so, we can help these children to realize their potential for a full life. Great progress has been made in recent years in addressing the problem of global poverty, but the problem isn't being solved fast enough. Through charitable giving, you can contribute towards more rapid progress in overcoming severe poverty.

    Even a donation of $5 can save a life by providing one mosquito net to a child in a malaria-prone area. FIVE DOLLARS could buy us a large cappuccino, and that same amount of money could be used to save a life.

    11 comments:

    Doug Portmore said...

    Convincing someone to donate some portion of a possible $10 win to one of six effective charities is not quite the same thing as convincing someone to donate to charity. Perhaps, many of the participants who declined to donate any portion of this possible $10 win were already giving to charity or had different future plans for giving to charity. In other words, I wonder if you're assuming a sort of utilitarian position that holds that what matters is whether someone chooses on each occasion the option that would do the most good for others rather than adopting a more Kantian position that holds that what matters is instead whether someone adopts helping others as an end and, consequently, dedicates (or has plans to dedicate) a significant portion of their time and resources to helping others. Someone with such an end wouldn't necessarily take every favorable opportunity to do so and so wouldn't necessarily donate any of this possible $10 win to charity.

    Patrick said...

    I think your comment that there was a single argument that had almost all of the top features and that performed extremely well points to a potential flaw with some of your conclusions. Namely, it might be that this highly successful argument performed so well because of only a few of its features but that its success made all its other features (which might be irrelevant or even harmful) look good by association. In particular, it appears that some of the top features appeared in only a small number of arguments, e.g. 4 or 5. With such a small number of arguments, one over-performer can significantly influence the average. I realize you briefly address this in your post, but I think the existence of a single highly successful argument with so many of the top features makes this point more compelling.

    In any case, I think this is a very interesting post. I think it would be interesting to see the results of an experiment involving arguments intentionally written to include the top (or bottom) features you identify here but obviously that would require a substantial amount of work to organize and would have many methodological challenges of its own.

    Eric Schwitzgebel said...

    Thanks for the comments, folks!

    Doug: Yes, of course you are correct that some participants might already be donating enough to charity, and it’s certainly also possible some people were moved by the argument and then donated to a different charity instead. We’re comparing the differences between arguments in their effectiveness in generating donations to one of these charities vs another argument or a control condition, so the arguments are appearing to have a positive effect on donation rates to these charities, in this experimental condition. It’s even consistent with our results that participants who normatively should not donate are in fact doing so. I hope I’m. It assuming utilitarianism, since I’m not a utilitarian!

    Patrick: Yes, that is a substantial methodological concern, which we will address later in a multi-level analysis that treats argument number as a random effect and also by doing factor analysis of the argument features. I think the results generally stand up. On your second point: Yes, that would be a natural follow-up!

    Eric Schwitzgebel said...

    Correction: … hope I’m not assuming utilitarianism…

    Sorry for the typo and clumsy phrasing — pecking at my phone rather than comfortably in my office.

    Doug Portmore said...

    Fair enough. I'm just thinking that if we accept a Kantian position, then what will be important is convincing people to adopt helping others as a significant, continually relevant, life-shaping end such that they end up dedicating significant portions of their time and resources to helping others over the course of their lives. And I worry that what you're measuring -- that is, whether an argument convinces someone to make an insignificant donation at a given moment -- isn't nearly as important, if important at all. Indeed, it's possible that what is effective at convincing someone to make an insignificant donation at a given moment (perhaps, a Singer-type argument) will be ineffective or even backfire when it comes to convincing people to adopt helping others as a significant, continually relevant, life-shaping end. I've noted that several of my students have been convinced by Singer's "Famine, Affluence, and Morality" to give to charity at the moment, but those that I've followed up with didn't sustain that giving after the semester ended. Perhaps, they realized that the Singer-type argument implies that one should give in a way that's unsustainable for most human beings. So, perhaps, you're not assuming utilitarianism but only rejecting the view that a Kantian would have regarding what we should be measuring.

    Arnold said...

    Empirical processing: by Evolution, by AI, by Personage...
    ...Evolution is natural selections, AI is phenomenal representations, Personage is now...

    If Charity is from Purpose, then an Empirical Ethical Standard can only apply to a person in what it is they do in a standard...

    Like, Protecting Children and "Teach Your Children Well"...
    ...apparently Charity Abounds...

    Howie said...

    How about arguments that appealed to Marxism or Marx? I guess that is covered by religion

    Eric Schwitzgebel said...

    Thanks for the continuing comments, folks!

    Howie: No arguments explicitly relied on Marxism.

    Doug: It's the perennial problem of doing what one can with what can realistically be measured. I agree lifetime giving -- or better, lifetime helping others in general -- would be better! The closest I have to that is self-reported percentage of income given to charity among ethics professors, non-ethicist philosophers, and a comparison group of other professors. Summary result: "Non-ethicist philosophers reported having donated the least to charity in 2008. 10% reported having donated nothing, compared to 4% of ethicists and 6% of non-philosophers. Excluding the 0's, non-ethicist philosophers' (log-transformed) mean self-reported donation rate was 2.6%, compared to 3.7% for ethicists and 3.6% for non-philosophers." A bit difficult to interpret! Full paper here:

    https://faculty.ucr.edu/~eschwitz/SchwitzAbs/EthSelfRep.htm



    Howard said...

    I was sarcastic about Marx- still upon reflection he has a point, this one: that people tend to feel that stuff spent on others is a waste, while anything spent on oneself or one's family, neighborhood or country is necessary.
    There is some kind of bias here that probably has a name unknown to me
    The question is, my question: is it restricted to capitalism or Western Countries?
    Any winning pitch has to address this as the main stumbling block

    Arnold said...

    65 years ago for 15 year old's, it was...
    ...one man's joy is another man's sorrow and ecology...

    Howard said...

    Anybody who watches the Karz for kids commercials (I don't watch TV voluntarily) or the infomercials for helping kids with cleft mouths would agree.
    However there is a certain fatigue to this appeal and it might work with some segments of the population more than others- this concern is heightened by the makeup of your sample- I'd say rationalists (readers of this blog) and mothers might be easier targets than football fans and MBA students- people who like philosophy are suckers for rational arguments are they not?
    Given your evolutionary hypothesis we must ask why appeal to children might work- different populations have different reasons, don't they?
    One has to ask how emotional appeal interacts with logical argument.