Assessing in-hospital mortality risk in ICU lung cancer patients using machine learning: An analysis based on the MIMIC-IV database

Jianwei Wang, Lizhen Lin, Li-ping Qiu, Li-lan Zheng, Lu-xi Wu, Hui Lv and Haihua Xie

PLOS ONE, 2026, vol. 21, issue 1, 1-16

Abstract: Background: Patients with advanced lung cancer admitted to the intensive care unit (ICU) face a substantially elevated risk of in-hospital mortality. Early identification of high-risk individuals is essential to support timely clinical decision-making. This study aimed to develop and validate a predictive model using machine learning (ML) techniques to estimate in-hospital mortality in this patient population. Methods: Clinical data were obtained from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression, enabling the construction of eight ML models: logistic regression (LR), support vector machine (SVM), gradient boosting machine (GBM), artificial neural network (ANN), extreme gradient boosting (XGBoost), k-nearest neighbors (k-NN), adaptive boosting (AdaBoost), and random forest (RF). Model performance was assessed using the area under the receiver operating characteristic curve (AUC), as well as accuracy, sensitivity, specificity, and F1 score. Discrimination, calibration, and clinical utility were also evaluated. The final model incorporated 27 clinically interpretable variables, including not only established severity scores (e.g., SAPS II) but also dynamic treatment factors (e.g., vasopressin, mechanical ventilation duration) that reflect real-world ICU practice. SHapley Additive exPlanations (SHAP) analysis was employed to enhance interpretability, allowing clinicians to understand both the magnitude and directionality of key predictors, an improvement over black-box ML applications in prior studies. Results: Among the 1,755 patients included, 368 (21%) died during hospitalization in the training cohort. Notably, older individuals, particularly those of Caucasian descent, demonstrated a higher susceptibility to mortality during their hospital stay. LASSO regression revealed that 27 variables demonstrated a significant correlation with lung cancer, such as gender and hospital stay duration. The XGBoost model achieved the highest predictive performance, with an accuracy of 0.783, an F1 score of 0.595, and an AUC of 0.865 (95% CI: 0.840–0.891) within the training cohort. The performance metrics for the test cohort reflected similar trends, with an accuracy of 0.719, an F1 score of 0.543, and an AUC of 0.790 (95% CI: 0.741–0.840). Key predictors identified consistently across models (LR, SVM, ANN, and XGBoost) included hospital stay duration, Simplified Acute Physiology Score II (SAPS II), use of norepinephrine and vasopressin, prothrombin time (PT), mechanical ventilation duration, white blood cell count (WBC), and blood urea nitrogen (BUN). The SHAP summary plot further illustrated the direction and magnitude of influence for the top 15 predictors. Conclusion: The XGBoost-based model showed the best performance in predicting in-hospital mortality among critically ill lung cancer patients. Hospital stay duration and SAPS II score emerged as the most influential predictors, which can serve as the basis for a simplified clinical risk score. These findings may support early risk stratification and guide clinical decision-making in the ICU. Because the analysis relied exclusively on internal splits of MIMIC-IV, the model's generalizability, and consequently its applicability in broader clinical contexts, is limited.
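The evaluation metrics named in the abstract (AUC with a 95% CI, accuracy, sensitivity, specificity, F1 score) can be illustrated with a minimal, dependency-free sketch. The abstract does not say how the confidence intervals were obtained; the percentile bootstrap used here is an assumption, as is the 0.5 classification threshold. The function names (`auc`, `threshold_metrics`, `bootstrap_auc_ci`) are hypothetical, and the toy labels and scores are invented for illustration and are not drawn from MIMIC-IV.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, specificity and F1 at a fixed cut-off
    (the 0.5 default is an assumption, not taken from the paper)."""
    pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0,
    }


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    if len(set(y_true)) < 2:
        raise ValueError("both outcome classes are required")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = min(n_boot - 1, int(n_boot * (1 - alpha / 2)))
    return stats[lo_idx], stats[hi_idx]


# Toy data: 1 = died in hospital, 0 = survived; scores are predicted risks.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.2, 0.6]

print("AUC:", auc(toy_labels, toy_scores))
print("Metrics:", threshold_metrics(toy_labels, toy_scores))
print("95% CI:", bootstrap_auc_ci(toy_labels, toy_scores))
```

With a real model, `toy_scores` would be replaced by predicted mortality probabilities on a held-out cohort. The pairwise AUC above is quadratic in sample size, which is fine for a sketch but slower than rank-based library implementations on cohorts of thousands of patients.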

Date: 2026

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0341259 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 41259&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0341259

DOI: 10.1371/journal.pone.0341259

More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone.

 
Page updated 2026-01-31
Handle: RePEc:plo:pone00:0341259