A framework to predict second primary lung cancer patients by using ensemble models
Yen-Chun Huang,
Chieh-Wen Ho,
Wen-Ru Chou and
Mingchih Chen
Additional contact information
Yen-Chun Huang: Tamkang University
Chieh-Wen Ho: Department of Biology, Texas A&M University
Wen-Ru Chou: Fu Jen Catholic University
Mingchih Chen: Fu Jen Catholic University
Annals of Operations Research, 2025, vol. 348, issue 1, No 16, 373-397
Abstract:
Machine learning (ML) prediction models, which have recently been widely used in the healthcare industry, serve as tools that help users make quick decisions. Their predictions can improve treatment outcomes and reduce medical expenses. This research proposes an ML-based decision tool to predict the probability of second primary lung cancer (SPLC) among lung cancer patients. The tool comprises four stages. The first stage is data processing, which selects the target patients from the National Health Insurance Research Database (NHIRD) over the 2011 to 2016 study period. The second stage uses the synthetic minority oversampling technique (SMOTE) to balance the data. The third stage is feature selection. In the final stage, five ML algorithms are applied with the optimal features: logistic regression (LGR), decision tree, random forests (RF), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGBoost), followed by building ensemble models. The results show that, after feature selection, the ensemble models yield an accuracy of 0.932. Type of therapy (chemotherapy (CH), radiotherapy (RT), and tyrosine kinase inhibitor (TKI)), clinical stage, and epidermal growth factor receptor (EGFR) status were the top five optimal features affecting the development of second primary lung cancer. This study can help physicians identify lung cancer patients at risk of second primary lung cancer and make complete treatment plans for them.
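As an illustration of the data-balancing stage mentioned in the abstract, the following is a minimal sketch of SMOTE-style oversampling in plain Python. It is not the authors' implementation: the function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy data are assumptions made up for this example. The idea shown is the core of the technique: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbors.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbors.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority rows to the base row (excluding itself).
        neighbors = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbors)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (made up): 3 minority rows against an assumed 8 majority rows.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
majority_count = 8
new_rows = smote(minority, n_new=majority_count - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(balanced_minority), "minority rows after oversampling")
```

Because every synthetic row is a convex combination of two real minority rows, it stays inside the range spanned by the observed minority data. In practice oversampling of this kind is usually applied to the training split only, so that synthetic rows do not leak into evaluation data.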
Keywords: SPLC; Lung cancer; NHIRD; Feature selection; SMOTE; Ensemble models
Date: 2025
Downloads:
http://link.springer.com/10.1007/s10479-023-05691-x Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:annopr:v:348:y:2025:i:1:d:10.1007_s10479-023-05691-x
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10479
DOI: 10.1007/s10479-023-05691-x
Annals of Operations Research is currently edited by Endre Boros
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.