There comes a time in everyone's life when their 18-year-old daughter, taking their first psychology class, asks, "Parental-figure-of-mine, what is 'validity'?"
For me that time came last week. Eeek!
Psychologists and social scientists use the term all the time, with a dazzling array of modifiers: internal validity, construct validity, external validity, convergent validity, predictive validity, discriminant validity, face validity, criterion validity.... But ask those same social scientists what validity is exactly, and how all of these notions relate to each other, and most will stumble.
As it happens, I was well positioned to address my daughter's question. I have a new paper, on "validity" in causal inference, forthcoming in the Journal of Causal Inference with social scientists Kevin Esterling and David Brady. This paper has been in progress since (again, eeek!) 2018. In previous posts I've addressed whether validity (in social science usage) is better understood as a property of inferences or as a property of claims (I argue the latter), and the intimate relationship of internal validity, external validity, and construct validity in causal inference.
Today, I'll attempt a brief, theoretically motivated taxonomy of the better-known types of validity. My aim is more descriptive than argumentative: I'll just outline how I think various "validities" hang together, and maybe some readers will find it to be an attractive and helpful picture.
I start with the assumption that validity is a feature of claims, not of inferences. Philosophers typically describe validity as a property of inferences. Social scientists are all over the map, and even prominent ones are sloppy in their usage. But it best organizes our thinking to address claims primarily and treat inferences as secondary.
I will say that a general causal claim that "A causes B in conditions C" is valid if and only if A does in fact cause B in conditions C. (Compare disquotational theories of truth in philosophy.) Consider for example the causal claim: Enforcement threats on reminder postcards (A) cause increased juror turnout (B) in the 21st-century United States (C).
This statement can be divided into four parts, each of which permits a distinctive type of validity failure:
(i.) A
(ii.) causes
(iii.) B
(iv.) in conditions C.
The four possible failures generate the core taxonomic structure.
Construct validity of the cause: Something might cause B in conditions C, but that something might not be A. A causal generalization has construct validity of the cause if the claim accurately specifies that A in particular (and not, for example, some other related thing) causes B in conditions C. Example of a failure of construct validity of the cause: Increased juror turnout among people who receive postcards might not be due to enforcement threats in particular but simply to being reminded of one's civic duty.
Construct validity of the effect: A might cause something in conditions C, but what it causes might not be B. A causal generalization has construct validity of the effect if the effect of A is accurately specified: A causes specifically B (and not, for example, some other related thing) in conditions C. Example of a failure of construct validity of the effect: Enforcement threats might increase the rates at which jurors who don't show up register a valid excuse without actually increasing turnout rates.
Generalizing: Construct validity is present in a causal generalization when the cause and effect are accurately specified.
External validity: A might cause B, but the conditions might not be correctly specified. A causal generalization has external validity if the claim accurately specifies the range of conditions in which it holds. Example of a failure: Enforcement messages might increase juror turnout not in the U.S. in general but only in low-income neighborhoods. Perfect external validity is probably an unattainable ideal for complex social and psychological processes, since the conditions in which causal generalizations hold will be complex and various.
Note on external validity: Common usage often holds that a claim is externally valid only if it holds across a wide range of contexts or conditions. However, this way of thinking unhelpfully denigrates perfectly accurate causal generalizations as "invalid" if they only hold, and are claimed only to hold, across a narrow range of conditions. Transportability is a better concept for characterizing breadth of applicability. An externally valid causal generalization that is accurately claimed to hold across only a narrow range of contexts is not transportable to contexts outside that range, but there is no inaccuracy or factual error in the statement "A causes B in conditions C" of the sort required for failure of validity. After all, A does cause B in conditions C, just as claimed. So validity in the overarching sense described above is present.
Internal validity: A might be related to B in conditions C, but the relation might not be the directional causal relationship claimed. A causal generalization is internally valid if there is a cause-effect relationship of the type claimed (even if the cause, the effect, and/or the conditions are not accurately specified). Example of a failure: There's a common cause of both A and B, which are not directly causally related. Maybe having a stable address causes potential jurors both to be more likely to be sent the postcards and to be more likely to turn out.
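A minimal simulation sketch can make the common-cause worry concrete. Everything in it is an illustrative assumption rather than data from any study: the probabilities are invented, the function names (`simulate`, `naive_gap`, `gap_within`) are hypothetical, and the postcards are built to have no effect at all on turnout. Even so, recipients turn out at a visibly higher rate, because a stable address raises both the chance of receiving a postcard and the chance of showing up.

```python
import random

def simulate(n=20000, seed=0):
    """Generate (stable_address, got_postcard, turned_out) for n potential jurors.

    Assumed, made-up probabilities: a stable address raises both the chance
    of being sent a postcard and the chance of turning out. The postcard
    itself never enters the turnout probability.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        stable = rng.random() < 0.5
        postcard = rng.random() < (0.8 if stable else 0.2)
        turnout = rng.random() < (0.6 if stable else 0.3)
        rows.append((stable, postcard, turnout))
    return rows

def turnout_rate(rows):
    return sum(1 for _, _, turned_out in rows if turned_out) / len(rows)

def naive_gap(rows):
    """Turnout among postcard recipients minus turnout among non-recipients."""
    sent = [r for r in rows if r[1]]
    not_sent = [r for r in rows if not r[1]]
    return turnout_rate(sent) - turnout_rate(not_sent)

def gap_within(rows, stable):
    """The same comparison, restricted to one address-stability group."""
    return naive_gap([r for r in rows if r[0] == stable])

rows = simulate()
print("naive gap:", round(naive_gap(rows), 3))
print("gap, stable addresses only:", round(gap_within(rows, True), 3))
print("gap, unstable addresses only:", round(gap_within(rows, False), 3))
```

Under these assumptions the naive gap is substantial, while the gaps within each address group hover near zero. The claim "postcards cause turnout" would fail internal validity here even though A and B are genuinely related.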
Other types of validity can be understood within the general spirit of this framework.
Convergent validity: Present when two causes claimed to have the same effect in fact have the same effect. In common use, the causes are measures, for example two different measures of extraversion. In this case, A1 (application of the first measure) and A2 (application of the second measure) are claimed to have a common effect B (same normalized extraversion score) in a set of conditions often left unspecified. Convergent validity is present if that claim is true (or to the degree it is true).
Discriminant validity: Present when two causes claimed to have different effects in fact have different effects. A1 is claimed to cause B, and A2 is claimed not to cause B (in a set of conditions that is often left unspecified), and discriminant validity is present when that claim is true (or to the degree it is true). In practice, discriminant validity is often supported by observation of low correlations in appropriately controlled conditions. If A1 and A2 are psychological or social measures (e.g., personality measures of extraversion and openness), then a high correlation between the scores would suggest that there is some common psychological feature both measures are tracking, contrary to the ideal of general discriminant validity.
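The correlational evidence typically offered for convergent and discriminant validity can be sketched with simulated scores. This is only an illustration under loudly stated assumptions: the two latent traits are generated as independent (real extraversion and openness need not be), each measure is its latent trait plus noise, and the names `simulate_scores` and `pearson` are hypothetical helpers, not part of any psychometrics library.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate_scores(n=5000, noise=0.5, seed=1):
    """Return scores on two extraversion measures and one openness measure.

    Assumption: latent extraversion and openness are independent standard
    normals, and each measure adds its own independent Gaussian noise.
    """
    rng = random.Random(seed)
    e1, e2, o1 = [], [], []
    for _ in range(n):
        extraversion = rng.gauss(0, 1)
        openness = rng.gauss(0, 1)
        e1.append(extraversion + rng.gauss(0, noise))
        e2.append(extraversion + rng.gauss(0, noise))
        o1.append(openness + rng.gauss(0, noise))
    return e1, e2, o1

e1, e2, o1 = simulate_scores()
print("two extraversion measures:", round(pearson(e1, e2), 2))
print("extraversion vs. openness:", round(pearson(e1, o1), 2))
```

In this setup the two extraversion measures correlate highly, which is the pattern taken to support convergent validity, while the extraversion and openness measures correlate near zero, which is the pattern taken to support discriminant validity.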
Predictive validity: Present when A is a common cause of B1 and B2, where B1 is typically the outcome of a measure and B2 is typically an event of practical import conceptually related but not closely physically related to B1. For example, application of a purported measure of recidivism (in this case, application of the measure isn't A but rather an intermediate event A1) among released prisoners has high predictive validity if high scores on the measure (B1) arise from the same cause or set of causes that generate high rates of recidivism (B2).
Note on predictive validity: A simpler characterization of "predictive validity" might be just that B1 accurately predicts B2, but this isn't the most useful way to conceptualize the issue if the prediction is correct in virtue of B1 causing B2 rather than operating by a common cause. If my wife reliably picks me up from work when I ask, my asking (B1) predicts her picking me up (B2), but my asking does not have "predictive validity" in the intended measurement sense. A better term for this relationship would be causal power.
Face validity: Present when it is intuitively or theoretically plausible that A causes B in conditions C. Notably, face validity needn't require that A in fact causes B in conditions C.
Ecological validity: A type of external validity that emphasizes the importance of generalizing correctly over real-world settings (as opposed to laboratory settings or other artificial settings).
Content validity: A type of construct validity focused on whether the content of a complex measure accurately reflects all aspects of the target measured.
Criterion validity: Present when a measure or intervention satisfies some prespecified criterion of success, regardless of whether the measure or intervention in fact measures what it purports to measure.
Finally, two types of validity where "validity" is a property of the inference rather than a matter of the truth of some part of a causal claim:
Statistical conclusion validity: Present when statistics are appropriately used, regardless of whether A in fact causes B in conditions C.
Logical validity: Present when the conclusion of an argument can't be false if its premises are true.