Explainable machine learning model for predicting cesarean section following induction of labor: Development and external validation using real-world data

Yanan Hu, Xin Zhang, Valerie Slavin, Joanne Enticott and Emily Callander

PLOS Digital Health, 2025, vol. 4, issue 11, 1-12

Abstract: Induction of labor (IOL) is a common yet complex clinical procedure associated with varying risks, including cesarean section (CS). Accurate prediction models may help support more informed, personalized decision-making. This study aimed to develop and validate an explainable machine learning prediction model for CS following IOL. We used population-based administrative perinatal datasets from two Australian states (New South Wales (NSW) and Queensland) covering all births between 2016 and 2019 for model development. Temporal validation was conducted using 2020 births from NSW, and geographical validation using 2016–2018 births from Victoria. We included women with singleton, cephalic, term, live births who attempted IOL and had no prior CS. Seven models (logistic regression, random forest, gradient boosting, LightGBM, XGBoost, CatBoost, and AdaBoost) were developed with hyperparameter tuning and feature selection. Performance was assessed using the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve, calibration plot (overall and across sociodemographic subgroups), decision curve analysis, Brier Score, and model parsimony. SHAP (SHapley Additive exPlanations) values were used to explain predictor contributions. A total of 180,700 women were included in model development (mean age 31 ± 5 years; CS = 20.8%). The optimal model, developed using XGBoost with ten predictors, achieved AUROCs of 0.76 (95% CI: 0.75–0.77) and 0.75 (95% CI: 0.74–0.76) in temporal (n = 14,527; CS = 22.5%) and geographical (n = 14,755; CS = 19.0%) validations, respectively. The most influential predictors were nulliparity, pre-pregnancy body mass index, and maternal age, while diabetes and hypertension (pre-existing or pregnancy-related) contributed least. Women with higher predicted CS probabilities had increased inpatient costs and maternal morbidity, regardless of actual mode of birth. 
The final model is accessible via an interactive web application (https://csai-8ccf2690242c.herokuapp.com/). This model demonstrates strong predictive performance using routinely collected maternal factors. Further co-design and implementation research is needed before potential clinical adoption.

Author summary: An increasing number of pregnant women are having their labor induced. One key concern in this process is the potential need for a cesarean section. To support more personalized and informed decision-making, our study focuses on the first essential step: developing and validating machine learning models that predict the likelihood of cesarean section using routinely collected information available before induction. Leveraging large-scale, real-world data from Australian maternity care, our final model demonstrated strong predictive performance and was designed to be transparent and explainable. We have deployed the best-performing model as a publicly accessible, user-friendly web application (https://csai-8ccf2690242c.herokuapp.com/). This tool provides an individualized prediction and establishes a foundation for future research on clinical implementation, user experience, and real-world impact. Ultimately, it may help guide early treatments, reduce unnecessary obstetric interventions, and improve the efficiency of healthcare resource use.
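Illustrative sketch (not the authors' code): the abstract reports AUROC with 95% confidence intervals, the Brier Score, and decision curve analysis. The Python below shows, under stated assumptions, how such quantities can be computed from observed outcomes and predicted probabilities. The function names, the toy data, and the use of a percentile bootstrap for the interval are all assumptions made for illustration; the abstract does not say how the intervals were obtained.

```python
import random


def auroc(y_true, y_prob):
    """AUROC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one threshold probability, the quantity plotted in a decision curve."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))


def bootstrap_ci(y_true, y_prob, metric, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, not taken from the abstract)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:  # skip resamples that contain only one class
            continue
        yp = [y_prob[i] for i in idx]
        stats.append(metric(yt, yp))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data invented for illustration: 1 = cesarean section, 0 = vaginal birth.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print(auroc(y_true, y_prob))             # 0.75
print(brier_score(y_true, y_prob))
print(net_benefit(y_true, y_prob, 0.3))
print(bootstrap_ci(y_true, y_prob, auroc))
```

On a real validation cohort, `y_true` would hold the observed mode of birth and `y_prob` the model's predicted probabilities. A decision curve is then obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the "treat all" and "treat none" strategies.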

Date: 2025

Downloads: (external link)
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0001061 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 01061&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0001061

DOI: 10.1371/journal.pdig.0001061

More articles in PLOS Digital Health from Public Library of Science

 
Page updated 2025-11-29
Handle: RePEc:plo:pdig00:0001061