Geospatial variation and machine learning approaches to predict open defecation practice in Zambia
Andualem Addisu Birlie,
Nega Abebe Meshesha,
Sefefe Birhanu Tizie,
Smegnew Gichew Wondie,
Selamawit Gashaw Teshome,
Tesfaye Deribe Bedada,
Ayana Alebachew Muluneh,
Biruktawit Lelisa Eticha,
Geleta Nenko Dube,
Gelgelo Wodessa and
Muluken Belachew Mengistie
PLOS ONE, 2026, vol. 21, issue 6, 1-23
Abstract:
Background: Open defecation is the disposal of human feces in fields, bushes, forests, open waterways, beaches, and other open areas. It degrades the environment, contaminates drinking water sources, causes malnutrition and low school attendance in children, and aids the spread of diseases such as cholera, diarrhea, dysentery, typhoid, polio, and hepatitis A. This study aimed to examine the geospatial variation of open defecation in Zambia and to predict the practice using machine learning approaches.
Methods: This study was a secondary analysis of data from the cross-sectional Zambia Demographic and Health Survey (ZMDHS) 2024. Spatial distribution, spatial autocorrelation, incremental autocorrelation, spatial interpolation, and hot spot detection were examined using ArcGIS 10.7. Machine learning algorithms implemented in Python were used to identify the features of open defecation practice. Data preparation comprised data cleaning, data transformation and integration, one-hot encoding, an 80/20 data split, and 10-fold stratified cross-validation. Seven machine learning algorithms were employed: adaptive boosting, categorical boosting (CatBoost), random forest, light gradient boosting, extreme gradient boosting, decision tree, and logistic regression.
Results: Among 12,808 households in Zambia, 12.1% practiced open defecation. Spatial analysis revealed significant clustering, with hot spots concentrated in the Southern, Western, and Eastern regions, highlighting areas in urgent need of intervention. Among the machine learning models applied to predict open defecation practice, light gradient boosting performed best, with an AUC of 83.83%. The model yielded 198 true positives, that is, households correctly classified as practicing open defecation.
Conclusion: Wealth index, access to treated water, access to electricity, educational level, age of the household head, access to media, and region were the most important features of open defecation practice. Governments, NGOs, policymakers, and researchers can use these findings to design targeted interventions that improve health and environmental sanitation by addressing the gaps and disparities identified.
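The evaluation steps named in the abstract (one-hot encoding, stratified 10-fold cross-validation, and AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names one_hot, stratified_folds, and auc are hypothetical, the data are made up, and only the Python standard library is used, whereas the study presumably relied on dedicated machine learning libraries.

```python
import random
from collections import defaultdict


def one_hot(rows, column):
    """Replace one categorical column with 0/1 indicator columns."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}={cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign indices to folds so each fold keeps similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy demo with made-up labels (not the survey data): about 12% positives.
labels = [1] * 24 + [0] * 176
folds = stratified_folds(labels, n_folds=10)
print([sum(labels[i] for i in fold) for fold in folds])  # positives per fold
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Stratification matters here because the outcome is imbalanced (12.1% of households): without it, some folds could contain very few positive cases, making per-fold AUC estimates unstable. The abstract does not state how the 80/20 split and the cross-validation were combined, so the sketch treats them as separate utilities.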
Date: 2026
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0350923 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 50923&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0350923
DOI: 10.1371/journal.pone.0350923
More articles in PLOS ONE from Public Library of Science