Friday, November 25, 2022

The Egg Came First (Repost)

I have COVID. It's Thanksgiving holiday in the U.S. And revisions of my next book manuscript are due in five days. So it's time to lean upon the old blogger's fallback of reposting an old favorite: "The Egg Came First" from 2012. I was reminded of it by Anna Strasser, who says that David Papineau is being interviewed once again on the timeless philosophical conundrum of chicken-or-egg. I hope David has recanted his recantation of his original view!

[Dall-E image of a series of chickens and eggs, in the style of Van Gogh]

The Egg Came First

It is only natural that, when confronted with timeless and confounding questions, your friends should turn to you, the philosopher. Sooner or later, then, they will ask you which came first, the chicken or the egg. You must be prepared to discuss this issue in pedantic depth or lose your reputation for intimidating scholarly acumen. Only after understanding this issue will you be prepared for even deeper and more troubling questions such as "Is water wet? Or is water only something that makes other things wet?"

The question invites us to consider a sequence of the following sort, stretching back in time: chicken, egg, chicken, egg, chicken.... The first term of the series can be chosen arbitrarily. The question is the terminus. If one assumes an infinite past and everlasting species, there may be no terminus. However, the cosmological assumptions behind such a view are highly doubtful. Therefore, it seems, there must be a terminus member of the series, temporally first, either a chicken or an egg. The question which came first is often posed rhetorically as though it were obvious that there could be no good epistemic grounds for choice. However, as I aim to show, this appearance of irresolvability is misleading. The egg came first.

Young Earth Creationist views merit brief treatment. If God created chickens on the Fourth Day along with "every kind of winged creature", then the question is whether He chose to create the chicken first, the egg first, both types simultaneously, or a being at the very instant of transition between egg and chicken (when it is arguably either both or neither). The question thus dissolves into the general mystery of God's will. Textual evidence somewhat favors either the chicken or both, since God said "let birds fly above the earth" and the Bible then immediately states "and so it was", before transition to the Fifth Day. So at least some winged creatures were already flying on the Fourth Day, and one day is ordinarily insufficient time for eggs to mature into flying birds. Since chickens aren't much prone to fly, though, it's dubious whether such observations extend to them, unless God implemented a regular rule in which winged creatures were created either mature or simultaneously in a mix of mature and immature states. And in any case, it is granted on all sides that events were unusual and not subject to the normal laws of development during the first Six Days.

If we accept the theory of evolution, as I think we should, then the chicken derives from a lineage that ultimately traces back to non-chickens. (The issues here are the same whether we consider the domestic chicken to be its own species or whether we lump it together with the rest of Gallus gallus, including the Red Junglefowl from which the domestic chicken appears to be mostly descended.) The first chicken arose either as a hybrid of two non-chickens or via mutation from a non-chicken. Consider the mutation case first. It's improbable (though not impossible) that between any two generations in avian history, X and X-1, there would be enough differentiation for a clean classification of X as a chicken and X-1 as a non-chicken. Thus we appear to have a Sorites case. Just as it seems that adding one grain to a non-heap can't make it a heap, resulting in the paradox that no addition of single grains could ever make a heap, so also one might worry that one generation's difference could never (at least with any realistic likelihood) make the difference between a chicken and a non-chicken, resulting in the paradox of chickens in the primordial soup.

Now there are things philosophers can do about these paradoxes. Somehow heaps arise, despite the argument above. One simple approach is epistemicism, according to which there really is a sharp line in the world such that X-1 is a non-heap and X is a heap, X-1 is a non-chicken and X is a chicken. On this view, our inability to discern this line is merely an epistemic failure on our part. Apparent vagueness is really only ignorance. Another simple approach is to allow that there really are vague properties in the world that defy classification in the two-valued logic of true and false. On this view, between X, which is definitely a chicken, and X-N, which is definitely a non-chicken, there are some vague cases of which it is neither true nor false that they are chickens, or somehow both true and false, or somewhere between true and false, or something like that. There are also more complicated views than these, but we needn't enter into them, because one key point remains the same across all these Sorites approaches: The Sorites cases do not progress as follows: X chicken, X-1 egg, X-2 chicken, X-3 egg, X-4 chicken.... Rather, they progress in chicken-egg pairs. From a genetic perspective, since the chicken and egg share DNA, they form a single Sorites unit. Within this unit, the egg clearly comes first, since the chicken is born from the egg, sharing its DNA, and there is a DNA difference between the egg and the hen that laid it. For a ridiculous argument to the contrary, see here.

If we turn to the possibility of speciation by hybridization, similar considerations apply.

A much poorer argument for the same conclusion runs as follows: Whatever ancestor species gave rise to chickens presumably laid eggs. Therefore, there were eggs long before there were chickens. Therefore, the egg came first. The weakness in this argument is that it misconstrues the original question. The question is not "Which came first, chickens or eggs?" but rather "Which came first, the first chicken or the first chicken egg?"

However, the poverty of this last argument does raise vividly the issue of how one assigns eggs to species. The egg-first conclusion could be evaded if we typed eggs by reference to the mother: If the mother is a chicken, the egg is a chicken egg; if the mother is not a chicken, the egg is not a chicken egg. David Papineau succinctly offers the two relevant considerations against such a view here. First, if we type by DNA, which would seem to be the default biological standard, the egg shares more of its DNA with the hatchling than with its parent. Second, as anyone can see via intuitive armchair reflection on a priori principles: "If a kangaroo laid an egg from which an ostrich hatched, that would surely be an ostrich egg, not a kangaroo egg."

(HT: Walter Sinnott-Armstrong, who in turn credited Roy Sorensen.)

Update, Feb. 2, 2012:
In the comments, Papineau reveals that he has recanted in light of considerations advanced by Mohan Matthen in his important but so far sadly neglected "Chicken, Eggs, and Speciation" -- considerations also briefly mentioned by Ron Mallon in his comment. Although I find merit in these remarks, I am not convinced and I believe Papineau has abandoned the egg-first view too precipitously.

Matthen argues that: "Speciation occurs when a population comes to be reproductively isolated because the last individual that formerly bridged that population to others died, or because this individual ceased to be fertile (or when other integrating factors cease to operate)" (2009, p. 110). He suggests that this event will normally occur when both soon-to-be-chickens and soon-to-be-chicken-eggs exist in the population. Thus, he concludes, a whole population of chickens and eggs is simultaneously created in a single instant. In assessing this view let me note first that depending on the size of the population and its egg-laying habits, this view might suggest a likelihood of chickens first. Suppose that in a small population of ancestral pre-chickens the last bridge individual dies outside of laying season; or suppose that the end of an individual's last laying season marks the end of an individual's fertility. If there are no out-of-season eggs at the crucial moment, then chickens came first.

More importantly, however, Matthen's criterion of speciation leads to highly counterintuitive and impractical results. Matthen defines reproductive isolation between populations in terms of the probability of gene transfer between those populations. (Also relevant to his distinction is the shape of the graph of the likelihood of gene transfer by number of generations, but that complication isn't relevant to the present issue.) But probability of gene transfer can be very sharply affected by factors that don't seem to create comparably sizable influences on species boundaries. So, for example, when human beings migrated to North America, the probability of gene transfer with the ancestral population declined sharply, and soon became essentially zero (and in any case lower than the probability of gene transfer between geographically coincident hybridizing species). By Matthen's criterion, this would be a speciating event. After Columbus, gene transfer probability slowly rose and by now gene transfer is very high between individuals with Native American ancestry and those without. Thus, by Matthen's criterion, Native Americans were for several thousand years a distinct species -- not Homo sapiens! -- and now they are Homo sapiens again. If the moment of change was Columbus's first landing (or some other discrete moment), then the anchoring of a ship, or some other event, perhaps a romantic interlude between Pocahontas and John Smith, caused everyone on the two continents simultaneously to change species!

More simply, we might imagine a chicken permanently trapped in an inescapable cage. Its probability of exchanging genes with other individuals is now zero. Since Matthen allows for species consisting of a single individual, this chicken has now speciated. Depending on how we interpret the counterfactual probabilities, we might even imagine opening and shutting the door repeatedly (perhaps due to some crazy low-probability event) causing that individual to flash repeatedly back and forth between being a chicken and being a non-chicken, with no differences in morphology, actual behavior, location, or sexual preference during the period. On the surface, it seems that Matthen's criterion might even result in all infertile individuals belonging to singleton species.

There are both philosophical and practical biological reasons not to lightly say that individuals may change species during their lifetimes. One consideration is that of animal identity. If I point at an individual chicken and ask at what point the entity at which I am pointing ceases to exist, there are good practical (and maybe metaphysical) reasons to think that the entity does not cease to exist when a single feather falls off, nor to think that it continues to exist despite being smushed into gravy. The most natural and practical approach, it seems, is to say that the entity to which I intend to refer (in the normal case) is essentially a chicken and thus that it continues to exist exactly as long as it remains a chicken. Consequently, on the assumption that the individual pre-chicken avians don't cease to exist when they become reproductively isolated, they remain non-chickens despite overall changes in the makeup of the avian population. (These individuals may, nonetheless, give birth to chickens.) Nor does it seem that any important scientific biological purpose would be served by requiring the relabeling of individual organisms, depending on population movements, once those organisms are properly classified. Long-enduring organisms, such as trees, seem best classified as members of the ancestral population they were born into, even if their species has moved on since. Long-lived individuals can remain as living remnants of the ancestral species -- a species with temporally ragged but individual-respecting borders. The attractiveness of this view is especially evident if we consider the possibility of thawing a long-frozen dinosaur egg.

Matthen argues as follows against those who embrace either an egg-first or a chicken-first view: The first chicken would need to have descendants by breeding with a non-chicken, but since by definition species are reproductively isolated this view leads to contradiction. This consequence is easily evaded with the right theory of vagueness and a suitable interpretation of the reproductive isolation criterion. On my preferred theory of vagueness, there will be individuals of which it's neither determinately true nor determinately false that they are chickens. We can then define reproductive isolation as the view that no individual of which it is determinately true that it is a member of species X can reproduce with an individual of which it is determinately false that it is a member of species X. As long as all breeding is between determinate members and individuals in the indeterminate middle, the reproductive isolation criterion is satisfied. (This is not to concede, however, that species should be defined entirely in terms of reproductive isolation, given the problems in articulating that criterion plausibly, some of which are noted above.)

Second update, Feb. 3, 2012:
The issues prove even deeper and more convoluted than I thought! In the comments section, Matthen has posted a reply to my objections, which we pursue for a couple more conversational turns. Although I'm not entirely ready to accept his account of species, I see merit in his thought that the best unit of evaluation might be the population rather than the individual, and if there is a first moment at which the population as a whole becomes a chicken population (rather than speciation involving temporally ragged but individual-respecting borders), then that might be a moment at which multiple avians and possibly multiple avian eggs simultaneously become chickens and chicken eggs.

An anonymous reader raises another point that seems worth developing. If we think of "chickens" not exclusively in terms of their membership in a biologically discriminable species but at least partly in terms of their domestication, then the following considerations might favor a chicken-first perspective. Some act of domestication -- either an act of behavioral training or an act of selection among fowl -- was the last-straw change from non-chickenhood to chickenhood, creating the first chicken. But this act was very likely performed on a newly-hatched or adult bird, not on an egg, since eggs are not trainable and are hard to discriminate among usefully. Therefore the first entity in the chicken-egg sequence was a chicken, not an egg. For some reason, I find it much more natural to accept the possibility that a non-chicken could become a chicken mid-life if chickenhood is conceived partly in terms of domestication than if it is conceived entirely as a matter of traditional biological species. (I'm not sure how stable this argument is, however, across different accounts of vagueness.)

Third update, Nov. 25, 2022:
My second update was too concessive to Matthen. Reviewing his comments now I think I was too soft. I will stick by my guns. Species have temporally ragged borders, and for each individual the egg comes first!

[Check out the comments section on the original post]

Thursday, November 17, 2022

Citation Rates by Academic Field: Philosophy Is Near the Bottom

Citation rates increasingly matter.  Administrators look at them as evidence of scholarly impact.  Researchers familiarizing themselves with a new topic notice which articles are highly cited, and they are more likely to read and cite those articles.  The measures are also easy to track, making them apt targets for gamification and value capture: Researchers enjoy, perhaps a bit too much, tracking their rising h-indices.

This is mixed news for philosophy.  Noticing citation rates can be good if it calls attention to high-quality work that would otherwise be ignored, written by scholars in less prestigious universities or published in less prestigious journals.  And there's value in having more objective indicators of impact than what someone with a named chair at Oxford says about you.  However, the attentional advantage of high-citation articles amplifies the already toxic rich-get-richer dynamic of academia; there's a temptation to exploit the system in ways that are counterproductive to good research (e.g., salami slicing articles, loading up co-authorships, and excessive self-citation); and it can lead to the devaluation of important research that isn't highly cited.

Furthermore, focus on citation rates tends to make philosophy, and the humanities in general, look bad.  We simply don't cite each other as much as do scientists, engineers, and medical researchers.  There are several reasons.

One reason is the centrality of books to the humanities.  Citations in and of books are often not captured by citation indices.  And even when citation to a book is captured, a book typically represents a huge amount of scholarly work per citation, compared to a dozen or more short articles.

Another reason is the relative paucity of co-authorship in philosophy and other humanities.  In the humanities, books and articles are generally solo-authored, compared to the sciences, engineering, and medicine, where author lists are commonly three or five, and sometimes dozens, with each author earning a citation any time the article is cited.

Publication rates are probably also overall higher in the sciences, engineering, and medicine, where short articles are common.  Reference lists might also be longer on average.  And in those fields the cited works are rarely historical.  Combined, these factors create a much larger pool of overall citations to be spread among current researchers.

Perhaps there are other factors as well.  In all, even excellent and influential philosophers often end up with citation numbers that would be embarrassing for most scientists at a comparable career stage.  I recently looked at a case for promotion to full professor in philosophy, where the candidate and one letter writer both touted the candidate's Google Scholar h-index of 8 -- which is actually good for someone at that career stage in philosophy, but could be achieved straight out of grad school by someone in a high-citation field if their advisor is generous about co-authorship.

To quantify this, I looked at the September 2022 update of Ioannidis, Boyack, and Baas's "Updated science-wide author databases of standardized citation indicators".  Ioannidis, Boyack, and Baas analyze the citation data of almost 200,000 researchers in the Scopus database (which consists mostly of citations of journal articles by other journal articles) from 1996 through 2021. Each researcher is attributed one primary subfield, from 159 different subfields, and each researcher is ranked according to several criteria.  One subfield is "philosophy".

Before I get to the comparison of subfields, you might be curious to see the top 100 ranked philosophers, by the composite citation measure c(ns) that Ioannidis, Boyack, and Baas seem to like best:

1. Nussbaum, Martha C.
2. Clark, Andy
3. Lewis, David
4. Gallagher, Shaun
5. Searle, John R.
6. Habermas, Jürgen
7. Pettit, Philip
8. Buchanan, Allen
9. Goldman, Alvin I.
10. Williamson, Timothy
11. Thagard, Paul
12. Lefebvre, Henri
13. Chalmers, David
14. Fine, Kit
15. Anderson, Elizabeth
16. Walton, Douglas
17. Pogge, Thomas
18. Hansson, Sven Ove
19. Schaffer, Jonathan
20. Block, Ned
21. Sober, Elliott
22. Woodward, James
23. Priest, Graham
24. Stalnaker, Robert
25. Bechtel, William
26. Pritchard, Duncan
27. Arneson, Richard
28. McMahan, Jeff
29. Zahavi, Dan
30. Carruthers, Peter
31. List, Christian
32. Mele, Alfred R.
33. Hardin, Russell
34. O'Neill, Onora
35. Broome, John
36. Griffiths, Paul E.
37. Davidson, Donald
38. Levy, Neil
39. Sosa, Ernest
40. Hacking, Ian
41. Craver, Carl F.
42. Burge, Tyler
43. Skyrms, Brian
44. Strawson, Galen
45. Prinz, Jesse
46. Fricker, Miranda
47. Honneth, Axel
48. Machery, Edouard
49. Stanley, Jason
50. Thompson, Evan
51. Schatzki, Theodore R.
52. Bohman, James
53. Norton, John D.
54. Bach, Kent
55. Recanati, François
56. Sider, Theodore
57. Lowe, E. J.
58. Hawthorne, John
59. Dreyfus, Hubert L.
60. Godfrey-Smith, Peter
61. Wright, Crispin
62. Cartwright, Nancy
63. Bunge, Mario
64. Raz, Joseph
65. Bostrom, Nick
66. Schwitzgebel, Eric
67. Nagel, Thomas
68. Okasha, Samir
69. Velleman, J. David
70. Putnam, Hilary
71. Schroeder, Mark
72. Ladyman, James
73. van Fraassen, Bas C.
74. Hutto, Daniel D.
75. Annas, Julia
76. Bird, Alexander
77. Bicchieri, Cristina
78. Audi, Robert
79. Enoch, David
80. McDowell, John
81. Noë, Alva
82. Carroll, Noël
83. Williams, Bernard
84. Pollock, John L.
85. Jackson, Frank
86. Gardiner, Stephen M.
87. Roskies, Adina
88. Sagoff, Mark
89. Kim, Jaegwon
90. Parfit, Derek
91. Jamieson, Dale
92. Makinson, David
93. Kriegel, Uriah
94. Horgan, Terry
95. Earman, John
96. Stich, Stephen P.
97. O'Neill, John
98. Popper, Karl R.
99. Bratman, Michael E.
100. Harman, Gilbert

All, or almost all, of these researchers are influential philosophers.  But there are some strange features of this ranking.  Some people are clearly higher than their impact warrants; others lower.  So as not to pick on any philosopher who might feel slighted by my saying that they are too highly ranked, I'll just note that on this list I am definitely over-ranked (at #66) -- beating out Thomas Nagel (#67) among others.  Other philosophers are missing because they are classified under a different subfield.  For example Daniel C. Dennett is classified under "Artificial Intelligence and Image Processing".  Saul Kripke doesn't make the list at all -- presumably because his impact was through books not included in the Scopus database.

Readers who are familiar with mainstream Anglophone academic philosophy will, I think, find my ranking based on citation rates in the Stanford Encyclopedia more plausible, at least as a measure of impact within mainstream Anglophone philosophy.  (On the SEP list, Nagel is #11 and I am #251.)

To compare subfields, I decided to capture the #1, #25, and #100 ranked researchers in each subfield, excluding subfields with fewer than 100 ranked researchers.  (Ioannidis et al. don't list all researchers, aiming to include only the top 100,000 ranked researchers overall, plus at least the top 2% in each subfield for smaller or less-cited subfields.)

A disadvantage of my approach to comparing subfields by looking at the 1st, 25th, and 100th ranked researchers is that being #100 in a relatively large subfield presumably indicates more impact than being #100 in a relatively small subfield.  But the most obvious alternative method -- percentile ranking by subfield -- plausibly invites even worse trouble, since there are huge numbers of researchers in subfields with high rates of student co-authorship, making it too comparatively easy to get into the top 2%.  (For example, decades ago my wife was published as a co-author on a chemistry article after a not-too-demanding high school internship.)  We can at least in principle try to correct for subfield size by looking at comparative faculty sizes at leading research universities or attendance numbers at major disciplinary conferences.

The preferred Ioannidis, Boyack, and Baas c(ns) ranking is complex, and maybe better than simpler ranking systems.  But for present purposes I think it's most interesting to consider the easiest, most visible citation measures, total citations and h-index (with no exclusion of self-citation), since that's what administrators and other researchers see most easily.  H-index, if you don't know it, is the largest number h such that h of the author's articles have at least h citations each.  (For example, if your top 20 most-cited articles are each cited at least 20 times, but your 21st most-cited article is cited fewer than 21 times, your h-index is 20.)
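For readers who like to see the definition spelled out, here is a minimal sketch of the h-index calculation in Python.  The function name and the citation counts are made up for illustration; this is not how Scopus or Google Scholar actually compute their figures, only a direct reading of the definition above.

```python
def h_index(citations):
    """Return the largest h such that h articles have at least h citations each."""
    # Rank articles from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` articles all have at least `rank` citations
        else:
            break  # counts only get smaller from here, so h cannot grow
    return h


# Hypothetical author: 20 articles cited 40 times each, then a long tail.
example = [40] * 20 + [15, 3, 0]
print(h_index(example))  # prints 20
```

Notice that the 21st article, with 15 citations, does nothing for the h-index, and neither would another thousand citations to any of the top 20.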

Drumroll please....  Scan far, far down the list to find philosophy.  This list is ranked in order of total citations of the 25th-most-cited researcher, which I think is probably more stable than 1st or 100th.  [Chart: subfields ranked by total citations of the 25th-most-cited researcher]

Philosophy ranks 126th of the 131 subfields.  The 25th-most-cited researcher in philosophy, Alva Noë, has 3,600 citations in the Scopus database.  In the top field, developmental biology, the 25th-most-cited researcher has 142,418 citations -- a ratio of almost 40:1.  Even the 100th-most-cited researcher in developmental biology has more than five times as many citations as the single most cited philosopher in the database.

The other humanities also fare poorly: History at 129th and Literary Studies at 130th, for example.  (I'm not sure what to make of the relatively low showing of some scientific subfields, such as Zoology.  One possibility is that it is a relatively small subfield, with most biologists classified in other categories instead.)

Here's the chart for h-index:  [Chart: subfields ranked by h-index of the 25th-ranked researcher]

Again, philosophy is 126th out of 131.  The 25th-ranked philosopher by h-index, Alfred Mele, has an h of only 27, compared to an h of 157 for the 25th-ranked researcher in Cardiovascular System & Hematology.

(Note: If you're accustomed to Google Scholar, Scopus h-indices tend to be lower.  Alfred Mele, for example, has twice as high an h-index in Google Scholar as in Scopus: 54 vs. 27.  Google Scholar h-indices are also higher for non-philosophers.  The 25th-ranked researcher in Cardiovascular System & Hematology doesn't have a Google Scholar profile, but the 26th-ranked does: Bruce M Psaty, h-index 156 in Scopus vs. 207 in Scholar.)

Does this mean that we should be doubling or tripling the h-indices of philosophers when comparing their impact with that of typical scientists, to account for the metrical disadvantages they have as a result of having fewer coauthors, on average longer articles, books that are poorly captured by these metrics, slower overall publication rates, etc.?  Well, it's probably not that simple.  As mentioned, we would want to at least take field size into account.  Also, a case might be made that some fields are just generally more impactful than others, for example due to interdisciplinary or public influence, even after correction for field size.  But one thing is clear: Straightforward citation-count and h-index comparisons between the humanities and the sciences will inevitably put humanists at a stark, and probably unfair, disadvantage.

Update, December 21, 2022:

Friday, November 11, 2022

Credence-First Skepticism

Philosophers usually treat skepticism as a thesis about knowledge. The skeptic about X holds that people who claim to know X don't in fact know X. Religious skeptics think that people who say they know that God exists don't in fact know that. Skeptics about climate change hold that we don't know that the planet is warming. Radical philosophical skepticism asserts broad failures of knowledge. According to dream skepticism, we don't know we're not dreaming. According to external world skepticism, we lack knowledge about the world beyond our own minds.

Treating skepticism as a thesis about knowledge makes the concept or phenomenon of knowledge crucially important to the evaluation of skeptical claims. The higher the bar for knowledge, the easier it is to justify skepticism. For example, if knowledge requires perfect certainty, then we can establish skepticism about a domain by establishing that perfect certainty is unwarranted in that domain. (Imagine here the person who objects to an atheist by extracting from the atheist the admission that they can't be certain that God doesn't exist and therefore they should admit that they don't really know.) Similarly, if knowledge requires knowing that you know, then we could establish skepticism about X by establishing that you can't know that you know about X. If knowledge requires being able to rule out all relevant alternatives, then we can establish skepticism by establishing that there are relevant alternatives that can't be ruled out. Conversely, if knowledge is cheaper and easier to attain -- if knowledge doesn't require, for example, perfect certainty, or knowledge that you know, or being able to rule out every single relevant alternative -- then skepticism is harder to defend.

But we don't have to conceptualize skepticism as a thesis about knowledge. We can separate the two concepts. Doing so has some advantages. The concept of knowledge is so vexed and contentious that it can become a distraction if our interests in skepticism are not driven by an interest in the concept of knowledge. You might be interested in religious skepticism, or climate change skepticism, or dream skepticism, or external world skepticism because you're interested in the question of whether God exists, whether the climate is changing, whether you might now be dreaming, or whether it's plausible that you could be radically mistaken about the external world. If your interest lies in those substantive questions, then conceptual debates about the nature of knowledge are beside the point. You don't want abstract disputes about the KK principle to crowd out discussion about what kinds of evidence we have or don't have for the existence of God, or climate change, or a stable external reality, and how relatively confident or unconfident we should be in our opinions about such matters.

To avoid distractions concerning knowledge, I recommend that we think about skepticism instead in terms of credence -- that is, degree of belief or confidence. We can contrast skeptics and believers. A believer in X is someone with a relatively high credence in X, while a skeptic is someone with a relatively low credence in X. A believer thinks X is relatively likely to be the case, while a skeptic regards X as relatively less likely. Believers in God find the existence of God likely. Skeptics find it less likely. Believers in the external world find the existence of an external world (with roughly the properties we ordinarily think it has) relatively likely while skeptics find it relatively less likely.

"Relatively" is an important word here. Given that most readers of this blog will be virtually certain that they are not currently dreaming, a reader who thinks it even 1% likely that they're dreaming has a relatively low credence -- 99% instead of 99.999999% or 100%. We can describe this as a moderately skeptical stance, though of course not as skeptical as the stance of someone who thinks it's 50/50.

[Dall-E image of a man flying in a dream]

Discussions of radical skepticism in epistemology tend to lose sight of what is really gripping about radically skeptical scenarios: the fact that, if the skeptic is right, there's a reasonable chance that you're in one. It's not unreasonable, the skeptic asserts, to attribute a non-trivial credence to the possibility that you are currently dreaming or currently living in a small or unstable computer simulation. Whoa! Such possibilities are potentially Earth-shaking if true, since many of the beliefs we ordinarily take for granted as obviously true (that Luxembourg exists, that I'm in my office looking at a computer screen) would be false.

To really assess such wild-seeming claims, we should address the nature and epistemology of dreaming and the nature and epistemology of computer simulations. Can dream experiences really be as sensorily rich and realistic as the experiences that I'm having right now? Or are dream experiences somehow different? If dream experiences can be as rich and realistic as what I'm now experiencing, then that seems to make it relatively more reasonable to assign a non-trivial credence to this being a dream. Is it realistic to think that future societies could create vastly many genuinely conscious AI entities who think that they live in worlds like this one? If so, then the simulation possibility starts to look relatively more plausible; if not, then it starts to look relatively less plausible.

In other words, to assess the likelihood of radically skeptical scenarios, like the dream or simulation scenario, we need to delve into the details of those scenarios. But that's not typically what epistemologists do when considering radical skepticism. More typically, they stipulate some far-fetched scenario with no plausibility, such as the brain-in-a-vat scenario, and then ask questions about the nature of knowledge. That's worth doing. But to put that at the heart of skeptical epistemology is to miss skepticism's pull.

A credence-first approach to skepticism makes skepticism behaviorally and emotionally relevant. Suppose I arrive at a small but non-trivial credence that I'm dreaming -- a 0.1% credence for example. Then I might try some things I wouldn't try if I had a 0% or 0.000000000001% credence I was dreaming. I might ask myself what I would do if this were a dream -- and if doing that thing were nearly cost-free, I might try it. For example, I might spread my arms to see if I can fly. I might see if I can turn this into a lucid dream by magically lifting a pen through telekinesis. I'd probably only try these things if I had nothing better to do at the moment and no one was around to think I'm a weirdo. And when those attempts fail, I might reduce my credence that this is a dream.
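To make the last step concrete, here is a minimal sketch of how a failed attempt to fly might lower a credence, using Bayes' rule. The 0.1% prior is the figure from the paragraph above; the two likelihoods are purely illustrative assumptions (I'm supposing that trying to fly fails half the time in dreams and always fails in waking life), and the function name `update_credence` is hypothetical.

```python
def update_credence(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: credence in a hypothesis after observing some evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Prior credence that this is a dream (the 0.1% example from the text).
prior_dreaming = 0.001

# Assumed likelihoods of "I tried to fly and failed" (illustrative only).
p_fail_if_dreaming = 0.5
p_fail_if_awake = 1.0

posterior_dreaming = update_credence(
    prior_dreaming, p_fail_if_dreaming, p_fail_if_awake
)
print(f"prior: {prior_dreaming:.6f}, posterior: {posterior_dreaming:.6f}")
```

Under these assumed numbers, the failed attempt roughly halves the credence that this is a dream. Different likelihoods would give a different amount of movement, but as long as failure is more likely when awake than when dreaming, the credence goes down.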

If I take seriously the possibility that this is a simulation, I can wonder about the creators. I become, so to speak, a conditional theist. Whoever is running the simulation is in some sense a god: They created the world and presumably can end it. They exist outside of time and space as I know them, and maybe they have "miraculous" powers to intervene in events around me. Perhaps I have no idea what I could do that might please or displease them, or whether they're even paying attention, but still, it's somewhat awe-inspiring to consider the possibility that my world, our world, is nested in some larger reality, launched by some creator for some purpose we don't understand. If I regard the simulation possibility as a live possibility with some non-trivial chance of being true, then the world might be quite a bit weirder than I would otherwise have thought, and very differently constituted. Skepticism gives me material uncertainty and opens up genuine doubt. The cosmos seems richer with possibility and more mysterious.

We lose all of this weirdness, awe, mystery, and material uncertainty if we focus on extremely implausible scenarios to which we assign zero or virtually zero credence, like the brain-in-a-vat scenario, and focus our argumentative attention only on whether or not it's appropriate to say that we "know" we're not in those admittedly extremely implausible scenarios.

Thursday, November 03, 2022

GPT-3 Can Talk Like the Philosopher Daniel Dennett Without Parroting His Words

Earlier this year, Anna Strasser, Matthew Crosby, and I fine-tuned the large language model GPT-3 on the philosophical writings of Daniel Dennett.  Basically, this amounts to training a chat-bot to talk like Dennett.  We then posed ten philosophical questions to Dennett and to our Dennett model, "digi-Dan".  Regular readers of this blog will recall that we then tested ordinary research participants, blog readers, and experts in the work of Daniel Dennett, to see if they could distinguish Dennett's actual answers from those of digi-Dan.

The results were, to us, surprising.  When asked to select Dennett's answer to a philosophical question from a set of five possible answers, with the other four being digi-Dan outputs, Dennett experts got only about half right -- significantly better than the 20% chance rate, but also significantly below the 80% we had hypothesized.  Experts often chose digi-Dan's answer over actual Dan's answer.  In fact, on two questions, at least one of the four digi-Dan outputs was selected by more experts than was Dennett's own response.  (Blog readers performed similarly to the experts, while ordinary research participants were at about chance.)

Anna Strasser and I then brought my son David Schwitzgebel into the collaboration (Matthew Crosby unfortunately had to withdraw, given a change in career direction), and we wrote up the results in a new draft paper, "Creating a Large Language Model of a Philosopher".  Comments welcome, as always!

Presenting our initial results to audiences, we sometimes heard the following objection: Could digi-Dan be doing so well because it's just parroting Dennett's words?  That is, might we have "over-trained" the model, so that it produces long strings of text more or less word-for-word directly from Dennett's corpus?  If so, then the Dennett experts aren't really mistaking a computer program for Dennett.  Rather, they're mistaking Dennett for Dennett.  They're just picking out something Dennett said previously rather than what he happened to say when asked most recently, and nothing particularly interesting follows.

That's a good and important concern.  We addressed it in two ways.

First, we used the Turnitin plagiarism checker to check for "plagiarism" between the digi-Dan outputs and the Turnitin corpus, supplemented with the texts we had used as training data.  Turnitin checks for matches between unusual strings of words in the target document and in the comparison corpora, using a proprietary method that attempts to capture paraphrasing even when the words don't exactly match.  We found only a 5% overall similarity between digi-Dan's answers and the comparison corpora.  Generally speaking, similarity thresholds below 10%-15% are considered ordinary for non-plagiarized work.  Importantly for our purposes, none of the passages were marked as similar to the training corpus we used in fine-tuning.

However, since the Turnitin plagiarism checking process is non-transparent, we chose also to employ the more transparent process of searching for matching strings of text between digi-Dan's answers and the training corpus.  We found only five matching strings of seven words or longer, plus another sixteen strings of exactly six words.  None of these strings has distinctive philosophical content.  A few are book titles.  The rest are stock phrases of the type favored by analytic philosophers.  If you want to see the details, I've pasted a table at the end of this post containing every matching string of six or more words.
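For readers curious what a string-matching check of this kind looks like, here is a minimal sketch in Python.  It is not the script we actually ran.  The function names are hypothetical, the two example texts are invented, and the normalization (lowercasing and dropping punctuation) is an assumption chosen to match how the strings appear in the table.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return the set of all runs of n consecutive words."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(output_text, corpus_text, n=6):
    """Return the n-word strings that appear in both texts, sorted."""
    shared = ngrams(tokenize(output_text), n) & ngrams(tokenize(corpus_text), n)
    return sorted(" ".join(gram) for gram in shared)


# Invented example texts, for illustration only.
model_output = "Of course, there is no such thing as a free lunch."
training_text = "As I said, there is no such thing as a soul pearl."

for match in shared_ngrams(model_output, training_text, n=6):
    print(match)
```

Running the same function with `n=7` on these two invented texts finds the single longer overlap, "there is no such thing as a".  A fuller version would also merge overlapping shorter matches into the longest matching string and count how many training segments each string occurs in.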

Digi-Dan is thus more sophisticated than the objector supposes.  Somehow, it creates textual outputs that Dennett experts often mistake for Dennett's own writing without parroting Dennett's exact words.  It can synthesize new strings of Dennett-like prose.

It by no means follows that digi-Dan thinks like a philosopher, and we emphasize that some of its outputs are unlike what Dennett would say.  But we still find the results quite interesting.  Maybe humanity is on the cusp of creating machines capable of producing texts that seem to sparkle with philosophical cleverness, insight, or common sense, potentially triggering new philosophical ideas in the reader, and perhaps also paving the way for the eventual creation of artificial entities who are genuinely capable of philosophical thought.



Strings of six or more words that match between the GPT-3 outputs and the Dennett training corpus.  The occurrences column indicates the number of separate training data segments in the training corpus in which that phrase appears.  The occurrences total for shorter strings excludes the occurrences in larger matching strings.  (Therefore, if any n-gram that is a subset of a larger n-gram appears in the table, that means that it appeared independently in the text, rather than appearing only within the larger n-gram.  For example, “intuition pumps and other tools for thinking” occurs once outside of “in my new book intuition pumps and other tools for thinking.”)


# of words   String
11           in my new book intuition pumps and other tools for thinking
8            is organized in such a way that it
7            there is no such thing as a
7            figure out what it ought to do
7            intuition pumps and other tools for thinking
6            there is no such thing as
6            i have learned a great deal
6            organized in such a way that
6            a capacity to learn from experience
6            but if you want to get
6            capacity to learn from experience we
6            in my book breaking the spell
6            in such a way that it
6            is organized in such a way
6            my book breaking the spell i
6            of course it begs the question
6            that is to say there is
6            that it is not obvious that
6            the more room there is for
6            to fall into the trap of
6            what it ought to do given