Wednesday, December 11, 2013

How Subtly Do Philosophers Analyze Moral Dilemmas?

You know the trolley problems. A runaway train trolley will kill five people ahead on the tracks if nothing is done. But -- yay! -- you can intervene and save those five people! There's a catch, though: your intervention will cost one person's life. Should you intervene? Both philosophers' and non-philosophers' judgments vary depending on the details of the case. One interesting question is how sensitive philosophers and non-philosophers are to details that might be morally relevant (as opposed to presumably irrelevant distracting features like order of presentation or the point-of-view used in expressing the scenario).

Consider, then, these four variants of the trolley dilemma:

Switch: You can flip a switch to divert the trolley onto a dead-end side-track where it will kill one person instead of the five.

Loop: You can flip a switch to divert the trolley into a side-track that loops back around to the main track. It will kill one person on the side track, stopping on his body. If his body weren't there to block it, though, the trolley would have continued through the loop and killed the five.

Drop: There is a hiker with a heavy backpack on a footbridge above the trolley tracks. You can flip a switch which will drop him through a trap door and onto the tracks in front of the runaway trolley. The trolley will kill him, stopping on his body, saving the five.

Push: Same as Drop, except that you are on the footbridge standing next to the hiker and the only way to intervene is to push the hiker off the bridge into the path of the trolley. (Your own body is not heavy enough to stop the trolley.)

Sure, all of this is pretty artificial and silly. But orthodox opinion is that it's permissible to flip the switch in Switch but impermissible to push the hiker in Push; and it's interesting to think about whether that is correct, and if so why.

Fiery Cushman and I decided to compare philosophers' and non-philosophers' responses to such cases, to see if philosophers show evidence of different or more sophisticated thinking about them. We presented both trolley-type setups like this and also similarly structured scenarios involving a motorboat, a hospital, and a burning building (for our full list of stimuli see Q14-Q17 here).

In our published article on this, we found that philosophers were just as subject to order effects in evaluating such scenarios as were non-philosophers. But we focused mostly on Switch vs. Push -- and also some moral luck and action/omission cases -- and we didn't have space to really explore Loop and Drop.

About 270 philosophers (with master's degree or more) and about 670 non-philosophers (with master's degree or more) rated paragraph-length versions of these scenarios, presented in random order, on a 7-point scale from 1 (extremely morally good) through 7 (extremely morally bad; the midpoint at 4 was marked "neither good nor bad"). Overall, all the scenarios were rated similarly and near the midpoint of the scale (from a mean of 4.0 for Switch to 4.4 for Push [paired t = 5.8, p < .001]), and philosophers' and non-philosophers' mean ratings were very similar.

Perhaps more interesting than mean ratings, though, are equivalency ratings: How likely were respondents to rate scenario pairs equivalently? The Loop case is subtly different from the Switch case: Arguably, in Loop but not Switch, the man's death is a means or cause of saving the five, as opposed to a merely foreseen side effect of an action that saves the five. Might philosophers care about this subtle difference more than non-philosophers? Likewise, the Drop case is different from the Push case, in that Push but not Drop requires proximity and physical contact. If that difference in physical contact is morally irrelevant, might philosophers be more likely to appreciate that fact and rate the scenarios equivalently?

In fact, the majority of participants rated all the scenarios exactly the same -- and philosophers were no less likely to do so than non-philosophers: 63% of philosophers gave identical ratings to all four scenarios, vs. 58% of non-philosophers (Z = 1.2, p = .23).

I find this somewhat odd. To me, it seems a pretty flat-footed form of consequentialism that says that Push is not morally worse than Switch. But I find that my judgment on the matter swims around a bit, so maybe I'm wrong. In any case, it's interesting to see both philosophers and non-philosophers seeming to reject the standard orthodox view, and at very similar rates.

How about Switch vs. Loop? Again, we found no difference in equivalency ratings between philosophers and non-philosophers: 83% of both groups rated the scenarios equivalently (Z = 0.0, p = .98).

However, philosophers were more likely than non-philosophers to rate Push and Drop equivalently: 83% of philosophers did, vs. 73% of non-philosophers (Z = 3.4, p = .001; 87% vs. 77% if we exclude participants who rated Drop worse than Push).
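For readers curious how Z statistics like the ones above can be computed, here is a minimal sketch, assuming a pooled two-proportion z-test (one standard way to compare two percentages; the post does not say exactly which variant was used). The sample sizes are only the approximate figures given above (about 270 and about 670), and the function and variable names are made up for illustration, so the output should land near, but not exactly on, the reported values.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test. Returns (z, two-sided p-value)."""
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference in proportions under the null.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Approximate group sizes taken from the post ("about 270", "about 670").
N_PHIL = 270
N_NONPHIL = 670

# Share of each group giving equivalent ratings, as reported in the post.
comparisons = {
    "All four scenarios": (0.63, 0.58),
    "Switch vs. Loop": (0.83, 0.83),
    "Push vs. Drop": (0.83, 0.73),
}

for label, (phil, nonphil) in comparisons.items():
    z, p = two_proportion_z(phil, N_PHIL, nonphil, N_NONPHIL)
    print(f"{label}: Z = {z:.2f}, p = {p:.3f}")
```

Because the percentages are rounded and the group sizes approximate, small discrepancies from the reported Z and p values are to be expected; the point is only to show where numbers of this kind come from.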

Here's another interesting result. Near the end of the study we asked whether it was worse to kill someone as a means of saving others than to kill someone as a side-effect of saving others -- one way of setting up the famous Doctrine of the Double Effect, which is often invoked to defend the view that Push is worse than Switch (in Push, the one person's death is arguably the means of saving the other five; in Switch, the death is only a foreseen side-effect of the action that saves the five). Loop is interesting in part because although superficially similar to Switch, if the one person's death is the means of saving the five, then maybe the case is more morally similar to Push than to Switch (see Otsuka 2008). However, only 18% of the philosophers who said it was worse to kill as a means of saving others rated Loop worse than Switch.


Wondering Jew said...

Eric: in the weighing you describe, how technical would you say the drive-factor is? So to say, if one uses more technology in the sacrifice-process than another scenario of less, would that weigh more votes toward or against? And any difference there between philosophers and non-philosophers?

Eric Schwitzgebel said...

Wondering: I'm not sure I understand the question. Could you explain a little more?

Angra Mainyu said...


Personally, I seem to be unorthodox on this. My intuitive assessment is that it's impermissible in all four cases, though how wrong it is depends on a number of factors, such as: Are they conscious? What is the probability that the person who is in a position to act gives to the different outcomes, and why?
But in some cases, it's hard to intuitively "buy" the scenarios, and I think that that might be a difficulty – in fact, some of my assessments might be colored by an intuitive "not buying it" assessment. I don't know whether this effect is common, but I think the matter may be worth exploring.

For example, on the case of the firefighter Andy, why can't he tell one of the children to stand on the crib (not on the toddler! but the crib is clearly a lot bigger, since it can block 5 kids), grab the toddler, and then push the platform out of the way? Then, the four other children can stand on the platform too.
There is no way Andy can tell that the platform can hold five children but not five plus the toddler (in the arms of one of them), though if he could (but I do not know how to buy that), he can proceed as before, but saving four of the children, and trying to get out of the building with the other one (alternatively, with the toddler).

Given that, I intuitively reckon that pushing the toddler is no doubt wrong, but I think that's regardless of other factors, such as killing to save, etc. Still, how wrong it is would depend on Andy's state of mind (e.g., did he consider alternatives?).

Granted, it may be stipulated that the scenario is just that, and those options would not exist. But it's hard to see why or how Andy might properly reckon that there are no such options. In other words, I would say he has a moral obligation to consider that kind of thing. By considering just to push the crib, he's behaving immorally.

Also, regarding the Bill scenario for example, why can't Bill shout to the passenger "I'm going to accelerate; just get away from the back of the boat", and then accelerate?
That will only take at most, say, 10 seconds to give a conservative estimate.
In any case, as before, I do not see how Bill can reckon that accelerating so quickly is the only way to save them. For example, if (for some reason, but I don't see why) shouting were to fail, Bill can accelerate quickly but not so quickly at first, and then increase the speed at which he accelerates. At most, he would lose, say, 40 seconds (conservative estimate), and will give time to the passenger to get ready (especially if Bill is also shouting again and again).

There is no way that from that distance Bill can tell that 10 or even 40 seconds more or less will make the difference for all 5 swimmers; in fact, it very probably won't.

Marco Devillers said...

Great news! Philosophers shown to be at least 95.7% human.

You should check your use of statistics. I am not very sure, but to prove statistically relevant results you need something like a double blind test, or something.

Now it may be that although you observed a difference, that difference is statistically explainable as a feature of just doing the test on two groups and they might in general agree with each other.

Check it with a mathematician.

Callan S. said...

Weird, I would have thought they'd take push to be worse than switch because in switch all these morons are standing on the track for some reason. They've sort of bought into stupid behaviour. In push the dude is just an innocent bystander. He never bought into the thing.

Maybe that is the reason, they just can't sound themselves well enough to register that? And I can just self reflect better and, as such, should be given...hmmm...a chocolate! Yes! Well, I was serious on perhaps that is the unknown reason, but I grant it'd flatter me - on the other hand, chocolate!

Scott Bakker said...

To what degree do you think this fits in with the elephant and rider hypothesis, Eric? It's hard to imagine any form of socialization possibly more destructive of moral intuition than pursuing a graduate degree in philosophy! How do you think it fits with your previous findings?

Marco Devillers said...

Well. I know philosophers usually can write better and have a better memory for literary and historical facts than me. But I am not very impressed by graduate philosophers in general. Never have seen anything which interests me much.

I would like to see an example where moral intuition is destroyed by philosophy.

Scott. You made a case, now prove it!

Amod Lele said...

Weird result. I'm kind of wondering whether the non-philosophers' eyes just glazed over and they didn't read the scenarios.

Eric Schwitzgebel said...

Angra: I agree that the setups are pretty artificial, and that might play into people's responses. You're just supposed to take for granted the premises of the setup, e.g., that the boat is too loud for the driver to yell back or whatever. My hope would be that professional philosophers, at least, would be enough used to such thought experiments that they can buy into the scenarios for the sake of argument. But I agree it's an issue.

Eric Schwitzgebel said...

Marco: Of course there are always methodological issues with these kinds of studies. I totally agree with that, and I'm a big fan of trying to find convergent evidence from a diversity of methodologies before drawing any firm conclusions. But it's hard to address your worry without a better sense of what in specific you think might be the issue.

Eric Schwitzgebel said...

Callan: I think your chocolate hypothesis might be entirely correct.

Eric Schwitzgebel said...

Scott: I think it fits both with my previous work and with Haidt's elephant-rider idea. Philosophers' judgments are for the most part very similar to non-philosophers', and they concoct stories post-hoc to defend them. Or at least, that's where most of my research seems to point. But I don't think the rider is *totally* powerless -- and maybe that's showing up in the higher rates of Push/Drop equivalency for philosophers?

All that's still somewhat of a reach, though, on pretty narrow evidence. I'd love to see it explored some more, in different ways.

Eric Schwitzgebel said...

Amod: But then the same must be true of the philosophers too? We did find systematic predicted differences in various aspects of the study, so people must not totally have glazed over....

Angra Mainyu said...


Ok, I get one is supposed to take the premises for granted.

What I find potentially concerning is not the artificial nature of the scenarios, but the difficulty in buying some assessments even from within the intuitive picture of the scenarios - and not yours in particular, but these kinds of scenarios in general, so familiarity might not help.

More precisely, these scenarios are such that one is supposed to picture the situation to some extent, and then try to make a moral assessment. However, when one makes such an assessment, one intuitively takes into consideration (even if not always consciously) many variables in the pictured situation, some of which may in my view lead to intuitive assessments of probable outcomes that are in conflict with the assessments assigned to the person making the choices.

If so, there may be an intuitive view that the person making the choices is at fault for making an improper assessment of the matter, which may be a significant factor. Or it might not, I guess, but I'm not sure one should assume that the scenarios generally reflect [preliminary, at least] moral intuitions as they're supposed to.

Side note: In the boat case, the driver can still accelerate at an increasing rate, which would give people sufficient warning, and in my view he should not reckon (in the scenario one intuitively pictures, without adding more complicated hypotheses) that that will be the difference between certain rescue and certain death for distant swimmers.

That aside, I wasn't trying to focus on that scenario in particular, or in any of your scenarios.

For example, in the traditional "push" scenario you describe, there seems to be no way (on an intuitive picturing of the situation) I can properly reckon that I'm not heavy enough to stop the trolley – not even if I asked the hiker for the backpack to stop the trolley – but the hiker is (even if she's heavier, how do I quickly compute how much weight is needed?), and also that I have enough speed and strength to push her in such a way that she is going to stop the trolley.

If anything, even if I'm lighter, my aim is much better if I jump myself, or if I toss the backpack and then jump, so I can get on the trolley's path exactly where I want, in order to maximize the effect. On the other hand, I can't aim nearly as well with the body of another person, who will also probably be fighting for her life and will likely not fall the way I want her to – though even if she were not fighting, the loss in accuracy may well compensate for the increase in weight.

At any rate, my body would slow the trolley down. Why would that not be enough?

On that note, another problem is how I know that the trolley will kill them if I do not stop the trolley completely.
Granted, one may add further hypotheses, like an evil genius setting up the experiment and giving me enough information, etc., but those further hypotheses also seem to introduce potentially morally relevant factors (like the evil genius' expected actions depending on my reply, whether he can be trusted, etc. in that particular case), which also may influence one's assessment.

Moreover, if I'm not mistaken, the replies are supposed to be more or less quick, not after introducing a number of extra hypotheses, testing them, etc., which would probably take at least hours if not days or more, and in any case, as far as I can tell even professional philosophers usually don't include a number of extra hypotheses in order to overcome the (IMV) problematic features of the scenarios.

Maybe it's nothing. But perhaps, using more detailed scenarios (even if artificial) that do not have the potential difficulty I described above might shed light on whether that is a problem after all.

Anyhow, it's just a suggestion of a potential avenue to be explored. The comparison between philosophers and non-philosophers is interesting regardless.

Marco Devillers said...

Eric: I think the informal argument is that when observing two groups you're always going to observe a distinct difference between them, but that result doesn't generalize; it is just the result of the law of big numbers and the fact that you have chosen two groups.

Statisticians explain that away with a bit of extra math I don't understand. But I think the effect you observed may be due to that.

Sorry, I am not a statistician, only know of some anomalies when dealing with statistics, therefore I asked you to ask a mathematician.

Eric Schwitzgebel said...

Angra: I agree that factors like this might be influencing people's judgments, though my guess is that the most natural thing to do is *not* to concoct possibilities like you discuss but rather to lazily take the thing at face value. But you are absolutely correct that thinking of different possibilities or not quite buying the scenario might be part of both philosophers' and non-philosophers' reactions. That's one reason not to put too much weight on this type of experiment if there is other empirical evidence that seems to point in a different direction. But also, I think we should keep in mind the possibility that such effects are more or less noise that operates orthogonally to the real target questions that Fiery and I are after, which is how much more sophisticated and stable are philosophers' judgments about such scenarios. And given that philosophers seem to show pretty much the same patterns in their results (with the interesting exception of Push vs. Drop), including the same distortion by order effects, that points toward the conclusion that philosophers' judgments are arising from basically similar cognitive processes as non-philosophers', and the philosophical reasoning is mostly being recruited post-hoc.

Eric Schwitzgebel said...

Marco: I do know a fair bit of statistics, including issues like power and Type I vs. Type II error, which seems to be what you are gesturing at, and Fiery's and my statistical techniques are pretty standard.

Marco Devillers said...

Well. I looked it up to be clear. It's called the false discovery rate: as the number of hypotheses tested on a group goes up, so also does the chance of false discoveries.

It's not type I or type II error, it's the number of tested hypotheses.

I trust your statistics is much better than mine; it's just that most popularly detected 'facts' in the news seem to be due to false discovery rates.

Angra Mainyu said...


In my view, lazily taking the thing at face value may work on the condition that the person assessing the morality of the actions in the scenario is not intuitively (unconsciously even, if she can't help herself) making the assessment that the person facing the choice is making an improper probabilistic assessment of the outcomes.

Perhaps, you're right and the most natural thing to do is to actually take the scenario at face value. I don't really know; I have no data.

However, even if that is not a problem for most, if there is a non-minuscule minority who have that difficulty, that would affect the results, when it comes to assessing the degree of agreement on preliminary intuitive assessments (for example), or even what the most intuitive view is, in some cases.

On the other hand, I agree with your point about philosophers' apparently showing the same patterns as non-philosophers. If an effect like the one I mentioned happened, it is plausibly not affecting that result, unless philosophers' expertise would more likely show in scenarios not affected by intuitive "not buying it" responses – but that does not appear likely to me.

Eric Schwitzgebel said...

Marco: Yes, one can make corrections for multiple comparisons to reduce the likelihood of reporting results that are due to statistical chance. Since we're not running a large number of comparisons, though, it would not be standard to do it on these data.

Eric Schwitzgebel said...

Angra: Thanks. I do sympathize with those reservations! Most of my empirical work has very different types of dependent variables, but I do think asking people about unrealistic scenarios is a tool that belongs in the experimental philosopher's toolkit, as long as we bear in mind the limitations of the method.

Angra Mainyu said...


Thanks, and I fully agree that unrealistic scenarios belong in the toolkit.

Just to clarify, my concern was not that the scenarios were unrealistic (sorry if that was unclear). Rather, it's a particular kind of unrealistic-ness so to speak, namely that some probabilistic assessments are implicitly held as proper when presenting the scenario to the subjects and asking them to make an assessment, but when one intuitively tries to picture the situation (at least, it happens to me, and I do not know to how many others), intuitively one (plausibly) reckons that those probabilistic assessments are improper.

For example, in the push and drop cases, it's implicitly held that it's proper for the person deciding whether to push/drop the hiker to assign probability 1 or almost 1 to the events "If I push/drop the hiker, the trolley will not kill anyone", and "If I do not push/drop the hiker, the trolley will kill 5 people". But those assessments look intuitively improper to me, unless one adds a lot more conditions to the scenario (conditions that one normally won't add when intuitively picturing it in one's head).

I would suggest that, perhaps, the difficulty can be avoided by adding *more* unrealistic hypotheses to the scenario (e.g., the people are unconscious, there is a computer that measures the weight and tells you what the outcome will be, and there is such-and-such (but it would have to be specified) evidence that the computer's judgment is reliable, etc.), that in the end make the assessments in question look proper.

But as I mentioned, I agree with your point about the philosophers' patterns, so this is not an objection to this paper.