Friday, February 26, 2021

Philosophy More Popular as a Second Than a First Major -- Especially Among Women

I was working through the NCES IPEDS database (yet again) for a new article on race and gender diversity in philosophy in the United States (yes, more fun data soon!), when I was struck by something: Among students whose second major is Philosophy, 43% are women.  Among students whose first major is Philosophy, 36% are women.  (IPEDS has an approximately complete database of Bachelor's degree recipients at accredited U.S. colleges and universities.)

The difference between 36% and 43% might not seem large, but I've spent over a decade looking at percentages of women in philosophy, and anything over 40% is rare.  For decades, until a recent uptick, the percentage of women majoring in philosophy stayed consistently in a band between 30% and 34%.  So that 43% pops out.  (And yes, it's statistically significantly different from 36%: 4353/12238 vs. 1496/3507, p < .001, aggregating the most recent two years' data from 2017-2018 and 2018-2019.)
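(If you want to verify the significance claim yourself, here's a quick sketch.  The post doesn't specify which test was used, so I'll assume a standard pooled two-proportion z-test, coded in plain Python:)

```python
from math import sqrt

def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test: returns the z statistic for
    the difference between x2/n2 and x1/n1."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# Women among first majors: 4353 of 12238; among second majors: 1496 of 3507
z = two_prop_z(4353, 12238, 1496, 3507)
print(round(z, 2))  # roughly 7.7 -- far beyond the ~3.29 cutoff for two-tailed p < .001
```

Any z above about 3.29 corresponds to two-tailed p < .001, so the difference is comfortably significant.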

So I decided to take a closer look, aggregating over the past ten years.  I limited my analysis by excluding universities with no Philosophy Bachelor's completions, universities with no second majors, and University of Washington-Bothell (which seems to have erroneous or at least unrepresentative data).  I found, as I have found before, that Philosophy is substantially more popular as a second major than as a first major.  In this group of universities, only 0.29% of women complete Philosophy as a first major, while 1.3% of women who complete a second major choose Philosophy.  Among men, it's 0.78% and 3.1%, respectively.

If you're curious about the relative popularity of Philosophy as first major, the earlier post has a bunch of analyses.  Today I'll just add a couple of correlational analyses, looking only at the subset of schools with at least 100 Bachelor's degrees in Philosophy over the 10-year period, to reduce noise.

School by school, the correlation between the percentage of students who complete a second major (of any sort) and the percentage of students who complete a Philosophy major (either as 1st or 2nd major) is 0.44 (p < .001).  In other words, schools with lots of second majors tend to also have relatively high numbers of Philosophy majors -- just as you'd expect, if Philosophy is much more popular as a second major than as a first major.  The correlation between the percentage of students who complete a second major (of any sort) and the percentage of women among those who complete a Philosophy major (either as 1st or 2nd major) is 0.18 (p = .004).  In other words, schools in which a second major is common also tend to have Philosophy majors that are more evenly divided between men and women.
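(For readers who want to see the mechanics, here's how such a school-by-school Pearson correlation is computed, sketched in Python with made-up toy numbers -- the real IPEDS figures are not reproduced here:)

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-school data: % of students completing any second major,
# paired with % completing a Philosophy major (1st or 2nd)
second_major_pct = [5.0, 12.0, 8.0, 20.0, 15.0]
philosophy_pct = [0.3, 0.9, 0.5, 1.4, 1.0]
r = pearson_r(second_major_pct, philosophy_pct)
print(round(r, 2))  # a high positive r for this deliberately tidy toy data
```

The real data are of course much noisier, which is why the observed correlations are 0.44 and 0.18 rather than anything near 1.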

[image source]

Wednesday, February 17, 2021

Three Faces of Validity: Internal, Construct, and External

I have a new draft paper in circulation, "The Necessity of Construct and External Validity for Generalized Causal Claims", co-written with two social scientists, Kevin Esterling and David Brady.  Here's a distillation of the core ideas.


Consider a simple causal claim: "α causes β in γ".  One type of event (say, caffeine after dinner) tends to cause another type of event (disrupted sleep) in a certain range of conditions (among typical North American college students).

Now consider a formal study you could run to test this.  You design an intervention: 20 ounces of Peet's Dark Roast in a white cup, served at 7 p.m.  You design a control condition: 20 ounces of Peet's decaf, served at the same time.  You recruit a population: 400 willing undergrads from Bigfoot Dorm, delighted to have free coffee.  Finally, you design a measure of disrupted sleep: wearable motion sensors that normally go quiet when a person is sleeping soundly.

You do everything right.  Assignment is random and double blind, everyone drinks all and only what's in their cup, etc., and you find a big, statistically significant treatment effect: The motion sensors are 20% more active between 2 and 4 a.m. for the coffee drinkers than the decaf drinkers.  You have what social scientists call internal validity.  The randomness, excellent execution, and large sample size ensure that there are no systematic differences between the treatment and control groups other than the contents of their cups (well...), so you know that your intervention had a causal effect on sleep patterns as measured by the motion sensors.  Yay!

You write it up for the campus newspaper: "Caffeine After Dinner Interferes with Sleep among College Students".

But do you know that?

Of course it's plausible.  And you have excellent internal validity.  But to get to a general claim of that sort, from your observation of 400 undergrads, requires further assumptions that we ought to be careful about.  What we know, based on considerations of internal validity alone, is that this particular intervention (20 oz. of Peet's Dark Roast) caused this particular outcome (more motion from 2 to 4 a.m.) the day and place the experiment was performed (Bigfoot Dorm, February 16, 2021).  In fact, even calling the intervention "20 oz. of Peet's Dark Roast" hides some assumptions -- for of course, the roast was from a particular batch, brewed in a particular way by a particular person, etc.  All you really know based on the methodology, if you're going to be super conservative, is this: Whatever it is that you did that differed between treatment and control had an effect on whatever it was you measured.

Call whatever it was you did in the treatment condition "A" and whatever it was you did differently in the control condition "-A".  Call whatever it was you measured "B".  And call the conditions, including both the environment and everything that was the same or balanced between treatment and control, "C" (that it was among Bigfoot Dorm students, using white cups, brewed at an average temperature of 195°F, etc.).

What we know then is that the probability, p, of B (whatever outcome you measured), was greater given A (whatever you did in the treatment condition) than in -A (whatever you did in the control condition), in C (the exact conditions in which the experiment was performed).  In other words:

p(B|A&C) > p(B|-A&C).  [Read this as "The probability of B given A and C is greater than the probability of B given not-A and C."]

But remember, what you claimed was both more specific and more general than that.  You claimed "caffeine after dinner interferes with sleep among college students".  To put it in the Greek-letter format with which we began, you claimed that α (caffeine after dinner) causes β (poor sleep) in γ (among college students, presumably in normal college dining and sleeping contexts in North America, though this was not clearly specified).

In other words, what you think is true is not merely the vague whatever-whatever sentence

p(B|A&C) > p(B|-A&C)

but rather the more ambitious and specific sentence

p(β|α&γ) > p(β|-α&γ).[1]

In order to get from one to the other, you need to do what Esterling, Brady, and I call causal specification.

You need to establish, or at least show plausible, that α is what mattered about A.  You need to establish that it was the caffeine that had the observed effect on B, rather than something else that differed between treatment and control, like tannin levels (which differed slightly between the dark roast and decaf).  The internally valid study tells you that the intervention had causal power, but nothing inside the study could possibly tell you what aspect of the intervention had the causal power.  It may seem likely, based on your prior knowledge, that it would be the caffeine rather than the tannins or any of the potentially infinite number of other things that differ between treatment and control (if you're creative, the list could be endless).

One way to represent this is to say that alongside α (the caffeine) are some presumably inert elements, θ (the tannins, etc.), that also differ between treatment and control.  The intervention A is really a bundle of α and θ: A = α&θ.  Now substituting α&θ for A, what the internally valid experiment established was

p(B|(α&θ)&C) > p(B|-(α&θ)&C).

If θ is causally inert, with no influence on the measured outcome B, you can drop the θ, thus moving from the sentence above to 

p(B|α&C) > p(B|-α&C).

In this case, you have what Esterling, Brady, and I call construct validity of the cause.  You have correctly specified the element that is doing the causal work.  It's not just A as a whole, but α in particular, the caffeine.  Of course, you can't just assert this.  You ought to establish it somehow.  That's the process of establishing construct validity of the cause.
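(A toy simulation can make the worry vivid.  In the invented world below, the supposedly inert element θ -- the tannins -- is secretly the only active ingredient, yet a perfectly internally valid experiment still finds a treatment effect.  All the probabilities and variable names here are made up for illustration:)

```python
import random

random.seed(0)

def run_trial(treated):
    # The bundled intervention A delivers both caffeine (alpha) and
    # tannins (theta).  In this toy world only the tannins disrupt
    # sleep; the caffeine is causally inert -- but nothing in the
    # experiment's design can reveal that.
    tannins_present = treated
    p_disrupted = 0.5 if tannins_present else 0.3
    return random.random() < p_disrupted

n = 10000
treat = sum(run_trial(True) for _ in range(n)) / n
ctrl = sum(run_trial(False) for _ in range(n)) / n
print(treat > ctrl)  # True: p(B|A&C) > p(B|-A&C) holds, though alpha did nothing
```

The randomized contrast between A and -A comes out exactly as it would if caffeine were the culprit: internal validity alone cannot distinguish α from θ.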

Analogous reasoning applies to the relationship between B (measured motion-sensor outputs) and β (disrupted sleep).  If you can establish the right kind of relationship between B and β you can move from a claim about B to a conclusion about β, thus moving from 

p(B|α&C) > p(B|-α&C)


to

p(β|α&C) > p(β|-α&C).

If this can be established, you have correctly specified the outcome and have achieved construct validity of the outcome.  You're really measuring disrupted sleep, as you claim to be, rather than something else (like non-disruptive limb movement during sleep).

And finally, if you can establish that the right kind of relationship holds between the actual testing conditions and the conditions to which you generalize (college students in typical North American eating and sleeping environments) -- then you can move from C to γ.  This will be so if your actual population is representative and the situation isn't strange.  More specifically, since what is "representative" and "strange" depends on what causes what, the specification of γ requires knowing what background conditions are required for α to have its effect on β.  If you know that, you can generalize to populations beyond your sample where the relevant conditions γ are present (and refrain from generalizing to cases where the relevant conditions are absent).  You can thus substitute γ for C, generating the causal generalization that you had been hoping for from the beginning:

p(β|α&γ) > p(β|-α&γ).

In this way, internal, construct, and external validity fit together.  Moving from finite, historically particular data to a general causal claim requires all three.  It requires establishing not only internal validity but also construct validity of the cause and outcome and external validity.  Otherwise, you don't have the well-supported generalization you think you have.

Although internal validity is often privileged in social scientists' discussions of causal inference, with internal validity alone, you know only that the particular intervention you made (whatever it was) had the specific effect you measured (whatever that effect amounts to) among the specific population you sampled at the time you ran the study.  You know only that something caused something.  You don't know what causes what.


Here's another way to think about it.  If you claim that "α causes β in γ", there are four ways you could go wrong:

(1.) Something might cause β in γ, but that something might not be α.  (The tannin rather than the caffeine might disrupt sleep.)

(2.) α might cause something in γ, but it might not cause β.  (The caffeine might cause more movement at night without actually disrupting sleep.)

(3.) α might cause β in some set of conditions, but not γ.  (Caffeine might disrupt sleep only in unusual circumstances particular to your school.  Maybe students are excitable because of a recent earthquake and wouldn't normally be bothered.)

(4.) α might have some relationship to β in γ, but it might not be a causal relationship of the sort claimed.  (Maybe, through an error in assignment procedures, only students on the noisy floors got the caffeine.)

Practices that ensure internal validity protect only against errors of Type 4.  To protect against errors of Types 1-3, you need proper causal specification, with both construct and external validity.


Note 1: Throughout the post, I assume that causes monotonically increase the probability of their effects, including in the presence of other causes.



[image modified from source]

Saturday, February 06, 2021

How to Respond to the Incredible Bizarreness of Panpsychism: Thoughts on Luke Roelofs' Combining Minds

Like a politician with bad news, Notre Dame Philosophical Reviews released my review of Luke Roelofs' Combining Minds Friday in the late afternoon.

It was a delight to review such an interesting book! I'll share the intro and conclusion here. For the middle, go to NDPR.


Panpsychism is trending. If you're not a panpsychist, you might find this puzzling. According to panpsychism, consciousness is ubiquitous. Even solitary elementary particles have or participate in it. This view might seem patently absurd -- as obviously false a philosophical view as you're likely to encounter. So why are so many excellent philosophers suddenly embracing it? If you read Luke Roelofs' book, you will probably not become a panpsychist, but at least you will understand.

Panpsychism, especially in Roelofs' hands, has the advantage of directly confronting two huge puzzles about consciousness that are relatively neglected by non-panpsychists. And panpsychism's biggest apparent downside, its incredible bizarreness (by the standards of ordinary common sense in our current culture), might not be quite as bad a flaw as it seems. I will introduce the puzzles and sketch Roelofs' answers, then discuss the overall argumentative structure of the book. I will conclude by discussing the daunting bizarreness.


4. The Incredible Bizarreness of Panpsychism

The book explores the architecture of panpsychism in impressive detail, especially the difficulties around combination. Roelofs' arguments are clear and rigorously laid out. Roelofs fairly acknowledges difficulties and objections, often presenting more than one response, resulting in a suite of possible related views rather than a single definitively supported view. The book is a trove of intricate, careful, intellectually honest metaphysics.

Nevertheless, the reader might simply find panpsychism too bizarre to accept. It would not be unreasonable to feel more confident that electrons aren't conscious than that any clever philosophical argument to the contrary is sound. No philosophical argument in the vicinity will have the nearly irresistible power of a mathematical proof or compelling series of scientific experiments. Big picture, broad scope, general theories of consciousness always depend upon weighing plausibilities against each other. So if a philosophical argument implies that electrons are conscious, you might reasonably reject the argument rather than accept the conclusion. You might find panpsychism just too profoundly implausible.

That is my own position, I suppose. I can't decisively refute panpsychism by pointing to some particle and saying "obviously, that's not conscious!" any more than Samuel Johnson could refute Berkeleyan metaphysical idealism by kicking a stone. Still, panpsychism (and Berkeleyan idealism) conflicts too sharply with my default philosophical starting points for me to be convinceable by anything short of an airtight proof of the sort it's unrealistic to expect in this domain. Yes, of course, as the history of science amply shows, our ordinary default commonsense understanding isn't always correct! But we must start somewhere, and it is reasonable to demand compelling grounds before abandoning those starting points that feel, to you, to be among the firmest.

Still, I don't think we should feel entirely confident or comfortable taking this stand. If there's one thing we know about the metaphysics of consciousness, it is that something bizarre must be true. Among the reasons to think so: Every well-developed theory of consciousness in the entire history of written philosophy on Earth has either been radically bizarre on its face or had radically bizarre consequences. (I defend this claim in detail here.) This includes dualist theories like those of Descartes (who notoriously denied animal consciousness) and "common sense" philosopher Thomas Reid (who argued that material objects can't cause anything or even cohere into stable shapes without the constant intervention of immaterial souls) as well as materialist or physicalist theories of the sort that have dominated Anglophone philosophy since the 1960s (which typically involve either commitment to attributing consciousness to strange assemblages, or denial of local supervenience, or both, and which seem to leave common sense farther behind the more specific they become). If no non-bizarre general theory of consciousness is available, or even (I suspect) constructible in principle, then we should be wary of treating bizarreness alone as sufficient grounds to reject a theory.

How sparse or abundant is consciousness in the universe? This is among the most central cosmological questions we can ask. A universe rich with conscious entities is very different from one in which conscious experience requires a rare confluence of unlikely events. Currently, theories run the full spectrum from the radical abundance of panpsychism to highly restrictive theories that raise doubts about whether even other mammals are conscious (e.g., Dennett 1996; Carruthers 2019). Various strange cases, like hypothetical robots and aliens, introduce further theoretical variation. Across an amazingly wide range of options, we can find theories that are coherent, defensible against the most obvious objections, and reconcilable with current empirical science. All theories -- unavoidably, it seems -- have some commitments that most of us will find bizarre and difficult to believe. The most appropriate response to all of this is, I think, doubt and wonder. In doubtful and wondrous mood, we might reasonably set aside a sliver of credence space for panpsychism.


Full review here.

Friday, February 05, 2021

Adversarial Collaboration

[originally posted at Brains Blog, with a lovely reply by Justin Sytsma, in which he compares your mind to Emmenthaler cheese]

You believe P. Your opponent believes not-P. Each of you thinks that new empirical evidence, if collected in the right way, will support your view. Maybe you should collaborate? An adversary can keep you honest and help you see the gaps and biases in your arguments. Adversarial collaboration can also add credibility, since readers can’t as easily complain about experimenter bias. Plus, when the data land your way, your adversary can’t as easily say that the experiment was done wrong!

My own experience with adversarial collaboration has been mostly positive. From 2004-2011, I collaborated with Russ Hurlburt on experience sampling methods (he’s an advocate, I’m a skeptic). Since 2017, I’ve been collaborating with Brad Cokelet and Peter Singer on whether teaching meat ethics to university students influences their campus food purchases (they thought it would, while I was doubtful). The first collaboration culminated in a book with MIT Press and double-issue symposium in Journal of Consciousness Studies. The second has so far produced an article in Cognition and hopefully more to come. Other work has been partly adversarial or conducted with researchers whose empirical guesses differed from mine.

I’ve also had two adversarial collaborations fail – fortunately in the early stages. Both failed for the same reason: lack of well-defined common ground. Securing common ground is essential to publication and uniquely challenging in adversarial collaboration.

I have three main pieces of advice:

(1.) Choose a partner who thrives on open dialogue.

(2.) Define your methods early in the project, especially the means of collecting the crucial data.

(3.) Segregate your empirical results from your theoretical conclusions.

To publish anything, you and your co-authors must speak as one. Without open dialogue, clearly defined methods, and segregation of results from theory, adversarial projects risk slipping into irreconcilable disagreement.

Open Dialogue

In what Jon Ellis and I have called open dialogue, you aim to present not just arguments in support of your position P but your real reasons for holding the view you hold, inviting scrutiny not only of P but also of the particular considerations you find convincing. You say “here’s why I think that P” with the goal of offering considerations C1, C2, and C3 in favor of P, where C1-3 (a.) epistemically support P and also (b.) causally sustain your opinion that P. Instead of having only one way to prove you wrong – showing that P is false or unsupported – your interlocutor now has three ways to prove you wrong. They can show P to be false or unsupported; they can show C1-3 to be false or unsupported; or they can show that C1-3 don’t in fact adequately support P. If they meet the challenge, your mind will change.

Contrast the lawyerly approach, the approach of someone who only aims to convince you or some other audience (or themselves, in post-hoc rationalization). The lawyerly interlocutor will normally offer reasons in favor of P, but if those reasons are defeated, that’s only a temporary inconvenience. They’ll just shift to a new set of reasons, if new reasons can be found. And in complicated matters of philosophy and human science, people can almost always find multiple reasons not to reject their pet ideas if they’re motivated enough. This can be frustrating for partners who had expected open dialogue! The lawyer’s position has, so to speak, secret layers of armor – new reasons they’ll suddenly devise if their first reasons are defeated. The open interlocutor, in contrast, aims to reveal exactly where the chinks in their armor are. They present their vulnerabilities: C1-3 are exactly the places to poke at if you want to win them over. Their opinion could shift, and such-and-such is what it would take.

In empirical adversarial collaboration, the most straightforward place to find common ground is in agreement that some C1 is a good test of P. You and your adversary both agree that if C1 proves to be empirically false, belief in P ought to be reduced or withdrawn, and if C1 proves to be empirically true, P is supported. Without open dialogue, you cannot know where your adversary’s reasoning rests. You can’t rely on the common ground that C1 is a good test of P. You thought you were testing P by means of testing C1. You thought that if C1 failed, your adversary would withdraw their commitment to P and you could write that up as your mutual result. If your adversary instead shifts lawyerlike to a new C2, the common ground you thought you had, the theoretical core you thought you shared, has disappeared, and your project has surprisingly changed shape. In one failed collaboration, I thought my adversary and I had agreed that such-and-such empirical evidence (from one of their earlier unpublished studies) wasn’t a good test of P, and so we began piloting alternative tests. However, they were secretly continuing to collect data on that earlier study. With the new data, their p value crossed .05, they got a quick journal acceptance – and voilà, they no longer felt that further evidence was necessary.

Now of course we all believe things for multiple reasons. Sometimes when new evidence arrives we find that our confidence in P doesn’t shift as much as we thought it would. This can’t be entirely known in advance, and it would be foolish to be too rigid. Still, we all have the experience of collaborators and conversation partners who are more versus less open. Choose an open one.

Define Your Methods Early

If C1, then P; and if not-C1 then not-P. Let’s suppose that this is your common ground. One of you thinks that you’ll discover C1 and P will be supported; the other thinks that you’ll discover the falsity of C1 and P will be disconfirmed. Relatively early in your collaboration, you need to find a mutually agreeable C1 that is diagnostic of the truth of P. If you’re thinking C1 is the way to test P and C2 wouldn’t really show much, while your adversary thinks C2 is really more diagnostic, you won’t get far. It’s not enough to disagree about the truth of P while aiming in sincere fellowship to find a good empirical test. You must also agree on what a good test would be – ideally a test in which either a positive or a negative result would be interesting. An actual test you can actually run! The more detailed, concrete, and specific, the better. My other failed collaboration collapsed for this reason. Discussion slowly revealed that the general approach one of us preferred was never going to satisfy the other two.

If you’re unusually lucky, maybe you and your adversary can agree on an experimental design, run the experiment, and get clean, interpretable results that you both agree show that P. It worked, wow! Your adversary saw the evidence and changed their mind.

In reality of course, testing is messy, results are ambiguous, and after the fact you’ll both think of things you could have done better or alternative interpretations you’d previously disregarded – especially if the test doesn’t turn out as you expected. Thinking clearly in advance about concrete methods and how you and your adversary would interpret alternative results will help reduce, but probably won’t eliminate, this shifting.

Segregate Your Empirical Results from Your Theoretical Conclusions

If you and your adversary choose your methods early and favor an open rather than a lawyerly approach, you’ll hopefully find yourselves agreeing, after the data are collected, that the results do at least superficially tend to support (or undermine) P. One of you is presumably somewhat surprised. Here’s my prediction: You’ll nevertheless still disagree about what exactly the research shows. How securely can you really conclude P? What alternative explanations remain open? What mechanism is most plausibly at work?

It’s fine to disagree here. Expect it! You entered with different understandings of the previous theoretical and empirical literature. You have different general perspectives, different senses of how considerations weigh against each other. Presumably that’s why you began as adversaries. That’s not all going to evaporate. My successful collaborations were successful in part, I think, because we were unsurprised by continuing disagreement and thus unfazed when it occurred, even though we were unable to predict in advance the precise shape of our evolving thoughts.

In write-up, you and your adversary will speak with one voice about motivations, methods, and results. But allow yourself room to disagree in the conclusion. Every experiment in the human sciences admits of multiple interpretations. If you insist on complete theoretical agreement, your project might collapse at this last stage. For example, the partner who is surprised by the results might insist on more follow-up studies than is realistic before they are fully convinced.

Science is hard. Science with an adversary is doubly hard, since sufficient common ground can be difficult to find. However, if you and your partner engage in open dialogue, the common ground is less likely to suddenly shift away than if one or both of you prevaricate. Early specification of methods helps solidify the ground before you invest too heavily in a project doomed by divergent empirical approaches. And allowing space at the end for alternative interpretations serves as a release valve, so you can complete the project despite continuing disagreement.

In a good adversarial collaboration, if you win you win. But if you lose, you also win. You’ve shown something new and (at least to you) surprising. Plus, you get to parade your virtuous susceptibility to evidence by uttering those rare and awesome words, “I was wrong.”

[image source]

Wednesday, February 03, 2021

And the Winner of the Philosophy Through Science Fiction Stories Flash Fiction Contest Is...

Tim Stevens with the story "Consciousness Weighs Nothing". He will receive a $100 cash prize and a copy of the book, and his story will be invited to the final round of consideration for publication in Flash Fiction Online.

Contestants were given two hours to write a 500-1000 word story on the spot, at the end of the January 18 book launch event for Philosophy Through Science Fiction Stories (ed. Helen De Cruz, Johan De Smedt, and Eric Schwitzgebel). We were delighted with the quality of the submissions, and we are pleased to give honorable mentions to the following authors, who will also be invited to the final round at Flash Fiction Online:

Trystan Goetze, for "Anaxamanda"

Melody Plan, for "Letter from Alpha Centauri"

Marren MacAdam, for "But What Are Gods Made of?"

Thanks to all who participated!

Tuesday, February 02, 2021

Philosophy Through Science Fiction Stories: YouTube Discussion with Joseph Orosco

I had a fun chat last weekend about the relationship between philosophy and science fiction, with Joseph Orosco at the Annares Project for Alternative Futures.  

Full conversation here.  Among other things, we discussed:

07:14: Science fiction in philosophical pedagogy vs. science fiction as itself a way of doing philosophy.

11:15: Philosophy as not just about advancing positions, but also a means of exploring positions (without necessarily advancing any) or provoking doubt and wonder, and the value of science fiction on this vision of philosophy.

12:40: Philosophical thinking as a spectrum from very abstract claims (e.g., "maximize overall happiness") through paragraph-long thought experiments all the way to fully developed fictions that engage the emotions and social cognition.

20:25: Imagination as the core of philosophy: Abstract claims are empty except insofar as they are fleshed out imaginatively through examples.

28:10: The philosophical and imaginative differences among the genres of "literary fiction", science fiction, and speculative fiction generally.

33:00: The philosophical and sociological aims of Philosophy Through Science Fiction Stories: Exploring the Boundaries of the Possible.