Table 2 Fallacy in Descriptive Epidemiology: Bringing Machine Learning to the Table
Christoffer Dharma, Rui Fu and Michael Chaiton
Additional contact information
Christoffer Dharma: Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
Rui Fu: Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
Michael Chaiton: Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
IJERPH, 2023, vol. 20, issue 13, 1-12
Abstract:
There is a lack of rigorous methodological development for descriptive epidemiology, where the goal is to describe and identify the most important associations with an outcome given a large set of potential predictors. This has often led to the Table 2 fallacy, in which the coefficient estimates for all covariates from a single multivariable regression model are presented together, even though they are often uninterpretable in a descriptive analysis. We argue that machine learning (ML) is a potential solution to this problem. We illustrate the power of ML with an example analysis identifying the most important predictors of alcohol abuse among sexual minority youth. The framework we propose for this analysis is as follows: (1) identify a few ML methods for the analysis, (2) optimize the parameters using the whole data with a nested cross-validation approach, (3) rank the variables using variable importance scores, (4) present partial dependence plots (PDPs) to illustrate the association between the important variables and the outcome, and (5) identify the strength of the interaction terms using the PDPs. We discuss the potential strengths and weaknesses of using ML methods for descriptive analysis and future directions for research. R code to reproduce these analyses is provided, which we invite other researchers to use.
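Steps (3) and (4) of the framework can be illustrated with a minimal, model-agnostic sketch. This is not the authors' R code. The `predict` function is a hypothetical stand-in for a fitted ML model, the variable names (`age`, `stress`, `noise`) and the synthetic data are assumptions made purely for illustration, and permutation importance is used here as one common kind of variable importance score.

```python
import random
from statistics import mean


def predict(row):
    """Hypothetical stand-in for a fitted ML model; returns a risk score."""
    age, stress, noise = row
    # Risk rises with stress, more steeply for respondents aged 18+ (an
    # interaction). The noise column is deliberately ignored.
    return 0.1 + 0.05 * stress + 0.02 * stress * (age >= 18)


def permutation_importance(predict, X, y, col, rng, n_repeats=20):
    """Mean increase in squared error after shuffling one column."""
    def mse(rows):
        return mean((predict(r) - t) ** 2 for r, t in zip(rows, y))

    base = mse(X)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in X]
        rng.shuffle(shuffled)
        X_perm = [r[:col] + (v,) + r[col + 1:] for r, v in zip(X, shuffled)]
        increases.append(mse(X_perm) - base)
    return mean(increases)


def partial_dependence(predict, X, col, grid):
    """Average prediction when column `col` is forced to each grid value."""
    return [
        mean(predict(r[:col] + (v,) + r[col + 1:]) for r in X)
        for v in grid
    ]


# Synthetic data (assumption): 300 respondents, outcome generated by the model.
rng = random.Random(0)
X = [(rng.randint(14, 24), rng.randint(0, 10), rng.random()) for _ in range(300)]
y = [predict(r) for r in X]

names = ["age", "stress", "noise"]
importance = {
    name: permutation_importance(predict, X, y, i, rng)
    for i, name in enumerate(names)
}
# Step (3): rank variables from most to least important.
ranking = sorted(names, key=importance.get, reverse=True)
# Step (4): partial dependence of the prediction on stress (values to plot).
pdp_stress = partial_dependence(predict, X, 1, grid=range(0, 11))
```

In a real analysis, `predict` would be replaced by a model tuned with nested cross-validation, and `pdp_stress` would be plotted against the grid rather than inspected as a list.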
Keywords: machine learning; descriptive analysis; data analysis methods; alcohol use; sexual minority youth
JEL-codes: I I1 I3 Q Q5
Date: 2023
Downloads:
https://www.mdpi.com/1660-4601/20/13/6194/pdf (application/pdf)
https://www.mdpi.com/1660-4601/20/13/6194/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:20:y:2023:i:13:p:6194-:d:1175932
IJERPH is currently edited by Ms. Jenna Liu
Bibliographic data for series maintained by MDPI Indexing Manager.