Building Better Models
Matthew Hindman
The ANNALS of the American Academy of Political and Social Science, 2015, vol. 659, issue 1, 48-62
Abstract:
Analytic techniques developed for big data have much broader applications in the social sciences, outperforming standard regression models even—or rather especially—in smaller datasets. This article offers an overview of machine learning methods well-suited to social science problems, including decision trees, dimension reduction methods, nearest neighbor algorithms, support vector models, and penalized regression. In addition to novel algorithms, machine learning places great emphasis on model checking (through holdout samples and cross-validation) and model shrinkage (adjusting predictions toward the mean to reduce overfitting). This article advocates replacing typical regression analyses with two different sorts of models used in concert. A multi-algorithm ensemble approach should be used to determine the noise floor of a given dataset, while simpler methods such as penalized regression or decision trees should be used for theory building and hypothesis testing.
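As a rough illustration of the model checking and shrinkage ideas named in the abstract, the sketch below fits a one-predictor penalized regression and picks the penalty by k-fold cross-validation. It is not taken from the article: it uses a ridge penalty rather than the Lasso because the one-predictor ridge fit has a simple closed form, and the function names (`fit_ridge`, `cv_error`, `choose_lambda`), the simulated data, and the penalty grid are all hypothetical choices made for this example.

```python
import random

def fit_ridge(xs, ys, lam):
    """One-predictor ridge fit; returns (intercept, slope).
    lam = 0 is ordinary least squares. Larger lam shrinks the slope
    toward 0, which pulls predictions toward the mean of ys."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return my - slope * mx, slope

def mse(model, xs, ys):
    """Mean squared prediction error of (intercept, slope) on xs, ys."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def k_folds(n, k):
    """Split indices 0..n-1 into k disjoint folds."""
    return [list(range(i, n, k)) for i in range(k)]

def cv_error(xs, ys, lam, k=5):
    """Average held-out MSE across k folds for a given penalty."""
    total = 0.0
    for fold in k_folds(len(xs), k):
        held = set(fold)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit_ridge(tr_x, tr_y, lam)
        total += mse(model, [xs[i] for i in fold], [ys[i] for i in fold])
    return total / k

def choose_lambda(xs, ys, grid, k=5):
    """Pick the penalty in grid with the lowest cross-validated error."""
    return min(grid, key=lambda lam: cv_error(xs, ys, lam, k))

# Simulated small dataset (an assumption for illustration only).
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(30)]
ys = [0.5 * x + random.gauss(0, 1) for x in xs]

grid = [0.0, 1.0, 5.0, 20.0]
best = choose_lambda(xs, ys, grid)
ols = fit_ridge(xs, ys, 0.0)
shrunk = fit_ridge(xs, ys, best)
print("chosen lambda:", best)
print("OLS slope:", ols[1], "shrunk slope:", shrunk[1])
```

The penalized slope is never larger in magnitude than the unpenalized one, which is the shrinkage the abstract describes. Whether a nonzero penalty is actually selected depends on the data.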
Keywords: big data; machine learning; predictive modeling; data science; penalized regression; ensemble learning; the Lasso
Date: 2015
Downloads: https://journals.sagepub.com/doi/10.1177/0002716215570279
Persistent link: https://EconPapers.repec.org/RePEc:sae:anname:v:659:y:2015:i:1:p:48-62
DOI: 10.1177/0002716215570279