Now You See Me: High School Dropout and Machine Learning
Dario Sansone
2017 Stata Conference from Stata Users Group
Abstract:
In this paper, we create an algorithm to predict which students will eventually drop out of US high schools using information available in 9th grade. We show that a naive model, as implemented in many schools, leads to poor predictions. We also explain how schools can obtain more precise predictions by exploiting the big data available to them and by using more sophisticated quantitative techniques. We compare the performance of econometric techniques such as Logistic Regression with Machine Learning tools such as Support Vector Machines, Boosting and LASSO. We offer practical advice on how to apply the new Machine Learning code available in Stata to the high-dimensional datasets available in education. Model parameters are calibrated by taking into account policy goals and budget constraints.
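As a rough illustration of the last point, the sketch below shows one way a budget constraint could determine who gets flagged: if a school can only support a fixed number of students, it flags those with the highest predicted risk and checks the resulting precision and recall. This is not the paper's procedure or its Stata code. The function names, scores, and outcomes are all hypothetical and chosen only to make the trade-off visible.

```python
# Toy illustration only: scores and labels below are invented, not from the paper.

def flag_within_budget(scores, budget):
    """Return the indices of the `budget` students with the highest predicted risk."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:budget])


def precision_recall(flagged, labels):
    """Precision and recall of the flagged set against true labels (1 = dropped out)."""
    true_pos = sum(labels[i] for i in flagged)
    total_pos = sum(labels)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


# Hypothetical predicted dropout probabilities for eight 9th graders
scores = [0.91, 0.15, 0.62, 0.08, 0.77, 0.30, 0.55, 0.05]
# Hypothetical outcomes: 1 = eventually dropped out
labels = [1, 0, 0, 0, 1, 0, 1, 0]

# Compare how many true dropouts are reached under different budgets
for budget in (2, 3, 4):
    flagged = flag_within_budget(scores, budget)
    precision, recall = precision_recall(flagged, labels)
    print(f"budget={budget}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy data, a larger budget reaches more of the students who eventually drop out but can also flag students who would have graduated anyway. A policy goal such as prioritizing recall over precision would shift the preferred cutoff.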
Date: 2017-08-10
New Economics Papers: this item is included in nep-big, nep-cmp and nep-edu
Downloads: (external link)
http://fmwww.bc.edu/repec/scon2017/Baltimore17_Sansone.pdf
Related works:
Working Paper: Beyond Early Warning Indicators: High School Dropout and Machine Learning (2017) 
Persistent link: https://EconPapers.repec.org/RePEc:boc:scon17:5
Bibliographic data for series maintained by Christopher F Baum.