Yule–Simpson’s paradox: the probabilistic versus the empirical conundrum

Aris Spanos

Statistical Methods & Applications, 2021, vol. 30, issue 2, No 9, 605-635

Abstract: The current literature views Simpson’s paradox as a probabilistic conundrum by taking the premises (probabilities/parameters/frequencies) as known. In such a context, it is shown that the paradox arises within a very small subset of the relevant parameter space, rendering the paradox unlikely to occur in real data. The problem, however, is that the probabilistic perspective ignores certain crucial empirical (data, statistical) issues raised by the original Pearson and Yule papers on ‘spurious’ association reversals. Placing the paradox in a broader empirical framework that begins with the raw data $\mathbf{z}_{0}$ and an appropriately selected statistical model $\mathcal{M}_{\boldsymbol{\theta}}(\mathbf{x})$, the discussion elucidates the original Yule–Pearson conundrum by formalizing its notion of ‘spurious or fictitious associations’ into ‘statistically untrustworthy associations’ stemming from a misspecified $\mathcal{M}_{\boldsymbol{\theta}}(\mathbf{x})$, that is, invalid probabilistic assumptions imposed on $\mathbf{z}_{0}$. It is shown that several empirical examples used to illustrate Simpson’s paradox in the current literature constitute examples of the Yule–Pearson untrustworthy association reversals. The empirical perspective is used to revisit the causal explanation of the paradox and make a case that several widely accepted causal claims are questionable on statistical adequacy grounds. It is also used to propose a procedure to detect and account for the ‘third entity’ in the paradox, as well as (reliably) select among different potential causal explanations, such as collider, mediator or confounder, on empirical grounds.

Keywords: Simpson’s paradox; Association reversal; Spurious correlation; Third entity; Probabilistic versus empirical paradoxes; Statistical misspecification; Statistical versus substantive adequacy; Misspecification testing; Graphical causal models; DAG models; Untrustworthy evidence; Confounding; Collider; Mediator
Date: 2021
Downloads: (external link)
http://link.springer.com/10.1007/s10260-020-00536-4 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:stmapp:v:30:y:2021:i:2:d:10.1007_s10260-020-00536-4

DOI: 10.1007/s10260-020-00536-4

Statistical Methods & Applications is currently edited by Tommaso Proietti

More articles in Statistical Methods & Applications from Springer, Società Italiana di Statistica
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-31
Handle: RePEc:spr:stmapp:v:30:y:2021:i:2:d:10.1007_s10260-020-00536-4