EconPapers    

Predicting Lung Cancer in the United States: A Multiple Model Examination of Public Health Factors

Arnold Kamis, Rui Cao, Yifan He, Yuan Tian and Chuyue Wu
Additional contact information
Arnold Kamis: International Business School, Brandeis University, 415 South Street, Waltham, MA 02454-9110, USA
Rui Cao: International Business School, Brandeis University, 415 South Street, Waltham, MA 02454-9110, USA
Yifan He: International Business School, Brandeis University, 415 South Street, Waltham, MA 02454-9110, USA
Yuan Tian: International Business School, Brandeis University, 415 South Street, Waltham, MA 02454-9110, USA
Chuyue Wu: International Business School, Brandeis University, 415 South Street, Waltham, MA 02454-9110, USA

IJERPH, 2021, vol. 18, issue 11, 1-27

Abstract: In this research, we take a multivariate, multi-method approach to predicting the incidence of lung cancer in the United States. We obtain public health and ambient emission data from multiple sources for 2000–2013 to model lung cancer incidence in the period 2013–2017. We compare several models using four sources of predictor variables: adult smoking, state, environmental quality index, and ambient emissions. The environmental quality index variables pertain to five macro-level domains: air, land, water, socio-demographic, and built environment. The ambient emissions consist of Cyanide compounds, Carbon Monoxide, Carbon Disulfide, Diesel Exhaust, Nitrogen Dioxide, Tropospheric Ozone, Coarse Particulate Matter, Fine Particulate Matter, and Sulfur Dioxide. We find that the best regression model explains 62 percent of the variance, whereas the best machine learning model explains 64 percent of the variance with 10 percent less error. The most hazardous ambient emissions are Coarse Particulate Matter, Fine Particulate Matter, Sulfur Dioxide, Carbon Monoxide, and Tropospheric Ozone. These ambient emissions could be curtailed to improve air quality, thus reducing the incidence of lung cancer. We interpret and discuss the implications of the model results, including the tradeoff between transparency and accuracy. We also review the limitations of the current models and directions for extending and refining them.
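
The abstract compares models by variance explained and by relative error. As a minimal sketch of how such a comparison could be computed, the Python below defines R-squared, root mean squared error (RMSE), and a relative error reduction. The abstract does not say which error metric the authors used, so RMSE is an assumption here, and all function names and numbers are hypothetical illustrations, not data or results from the paper.

```python
import math


def r_squared(y_true, y_pred):
    """Fraction of variance in y_true explained by y_pred."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error (assumed metric; the abstract does not name one)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def relative_error_reduction(baseline_error, candidate_error):
    """Share by which candidate_error is lower than baseline_error."""
    return (baseline_error - candidate_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical incidence rates and predictions, for illustration only.
    observed = [60.0, 55.0, 70.0, 65.0, 50.0]
    regression_pred = [58.0, 57.0, 66.0, 63.0, 54.0]
    ml_pred = [59.0, 56.0, 67.0, 64.0, 53.0]

    for name, pred in [("regression", regression_pred), ("ml", ml_pred)]:
        print(name, "R^2:", r_squared(observed, pred), "RMSE:", rmse(observed, pred))

    reduction = relative_error_reduction(
        rmse(observed, regression_pred), rmse(observed, ml_pred)
    )
    print("relative error reduction:", reduction)
```

Under this reading, "10 percent less error" would correspond to a relative error reduction of 0.10 for the machine learning model against the regression baseline.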

Keywords: adult smoking; lung cancer; United States; regression; environmental quality index; ambient emissions; machine learning; transparency; iterative modeling
JEL-codes: I I1 I3 Q Q5
Date: 2021

Downloads: (external link)
https://www.mdpi.com/1660-4601/18/11/6127/pdf (application/pdf)
https://www.mdpi.com/1660-4601/18/11/6127/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:18:y:2021:i:11:p:6127-:d:569917

IJERPH is currently edited by Ms. Jenna Liu

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jijerp:v:18:y:2021:i:11:p:6127-:d:569917