Identifying determinants and predicting cesarean section delivery among Bangladeshi women using machine learning: Insight from BDHS 2022 Data
Shamsuz Zoha,
Shahin Alam,
Isteaq Kabir Sifat,
Nourin Sultana and
Md Kaderi Kibria
PLOS Global Public Health, 2025, vol. 5, issue 11, 1-18
Abstract:
Cesarean section (C-section) rates have been rising globally, posing potential health risks for mothers and infants. Understanding the factors that contribute to C-section delivery and leveraging machine learning (ML) techniques for predictive modeling can support targeted interventions and informed policy decisions. This study aimed to identify the determinants of C-section delivery and develop an ML-based predictive model using data from the Bangladesh Demographic and Health Survey (BDHS) 2022. A total of 2,490 complete records of ever-married women aged 15–49 years were analyzed, with delivery mode categorized as vaginal or C-section. Three feature selection techniques, Recursive Feature Elimination (RFE), Boruta-based selection (BFS), and Random Forest (RF), were used to identify key risk factors. Six ML algorithms, Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), and Extreme Gradient Boosting (XGB), were employed to predict C-section delivery. Model performance was evaluated using accuracy, precision, recall, F1-score, AUC, and ROC analysis. SHAP values were used to interpret the influence of individual features. The prevalence of C-section deliveries was 45.6%, with an average maternal age of 25.7 years and a mean age at first childbirth of 19.3 years. Ten significant determinants were identified: place of delivery, baby weight, maternal BMI, birth interval, age at first birth, partner’s education, maternal age, wealth status, ANC visits, and maternal education. The RF model achieved the highest performance, with an accuracy of 81.79% and an AUC of 0.871. SHAP analysis highlighted place of delivery, baby weight, maternal BMI, and birth interval as the most influential predictors. These findings suggest that socio-demographic and healthcare-related factors strongly influence C-section delivery.
Machine learning models, particularly RF, can effectively identify women at high risk, supporting strategies to reduce unnecessary C-sections and improve maternal healthcare planning in Bangladesh.
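The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `binary_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and are unrelated to the BDHS 2022 data. Labels use 1 for C-section and 0 for vaginal delivery.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from BDHS 2022): true labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(binary_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason both kinds of metric are commonly reported together.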
Date: 2025
Downloads: (external link)
https://journals.plos.org/globalpublichealth/artic ... journal.pgph.0005494 (text/html)
https://journals.plos.org/globalpublichealth/artic ... 05494&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pgph00:0005494
DOI: 10.1371/journal.pgph.0005494
More articles in PLOS Global Public Health from Public Library of Science