Machine learning prediction of metabolic-associated fatty liver disease in type 2 diabetes: Emphasizing data imputation and feature selection
Zahra Khosravi,
Farnaz Barzinpour,
Soghra Rabizadeh,
Manouchehr Nakhjavani and
Alireza Esteghamati
PLOS ONE, 2026, vol. 21, issue 2, 1-19
Abstract:
Metabolic-Associated Fatty Liver Disease (MAFLD) is common among Type 2 Diabetes (T2DM) patients. The coexistence of these conditions increases the risk of MAFLD progression and diabetes complications. Detecting MAFLD early is challenging due to its asymptomatic initial stages. In this study, we aimed to develop a machine learning model to predict MAFLD in T2DM patients. We conducted a cross-sectional study on 3,654 Iranian T2DM patients using their demographic and lab data. This study involved thorough data preprocessing, including evaluating various imputation methods on simulated missingness in a complete subset of the dataset. Additionally, four feature selection methods were applied to eight machine learning models to identify the most effective predictive model. The XGBoost classifier without feature selection achieved the best performance, with an accuracy of 80.6% and an area under the receiver operating characteristic curve (AUC) of 88.9%. Notably, certain features, such as alanine aminotransferase (ALT), platelet count (PLT), and vitamin D (VitD), influenced the predictive performance.
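The imputation check described in the abstract (hiding known values in a complete subset, imputing them, and scoring the result against the truth) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the abstract does not name the imputation methods, the missingness mechanism, or the error metric, so mean and median imputation, missing-completely-at-random masking, and RMSE are stand-ins, and the data are synthetic rather than taken from the study.

```python
import math
import random
import statistics


def simulate_missingness(rows, rate, rng):
    """Return a copy of rows with entries hidden (set to None) at random with probability `rate`."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def impute_columns(rows, strategy):
    """Fill None entries column by column with a summary (e.g. mean) of the observed values."""
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        fills.append(strategy(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(row)] for row in rows]


def masked_rmse(truth, masked, imputed):
    """Root mean squared error over the hidden entries only."""
    errors = [
        (imputed[i][j] - truth[i][j]) ** 2
        for i, row in enumerate(masked)
        for j, v in enumerate(row)
        if v is None
    ]
    return math.sqrt(sum(errors) / len(errors))


rng = random.Random(0)

# Synthetic stand-in for a complete subset: two right-skewed, lab-like columns.
complete = [
    [rng.lognormvariate(3.0, 0.5), rng.lognormvariate(5.0, 0.3)]
    for _ in range(500)
]

# Hide roughly 20% of the entries, then score each candidate imputer on them.
masked = simulate_missingness(complete, rate=0.2, rng=rng)

candidates = {"mean": statistics.mean, "median": statistics.median}
scores = {
    name: masked_rmse(complete, masked, impute_columns(masked, strategy))
    for name, strategy in candidates.items()
}

for name, rmse in scores.items():
    print(f"{name}: RMSE = {rmse:.3f}")
```

Because the columns are on different scales, the pooled RMSE in this sketch is dominated by the larger column; a per-feature or standardized error would be a reasonable refinement.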
Date: 2026
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0339580 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 39580&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0339580
DOI: 10.1371/journal.pone.0339580