Using machine learning and single nucleotide polymorphisms for improving rheumatoid arthritis risk Prediction in postmenopausal women
Yingke Xu and
Qing Wu
PLOS Digital Health, 2025, vol. 4, issue 4, 1-13
Abstract:
Genetic factors contribute to 60-70% of the variability in rheumatoid arthritis (RA). However, few studies have used genetic variants to predict RA risk. This study aimed to enhance RA risk prediction by leveraging single nucleotide polymorphisms (SNPs) through machine-learning algorithms, utilizing Women’s Health Initiative data. We developed four predictive models: 1) based on common RA risk factors, 2) model 1 incorporating polygenic risk scores (PRS) with principal components, 3) model 1 and SNPs after feature reduction, and 4) model 1 and SNPs with kernel principal component analysis. Each model was assessed using logistic regression (LR), random forest (RF), eXtreme Gradient Boosting (XGBoost), and support vector machine (SVM). Performance metrics included the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive and negative predictive values (PPV and NPV), and F1-score. The fourth model, integrating SNPs with XGBoost, outperformed all other models. In addition, the XGBoost model that combines genomic data with conventional phenotypic predictors significantly enhanced predictive accuracy, achieving the highest AUC of 0.90 and an F1 score of 0.83. The DeLong test confirmed significant differences in AUC between this model and the others (p-values
Date: 2025
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000790 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 00790&type=printable (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0000790
DOI: 10.1371/journal.pdig.0000790
Access Statistics for this article
More articles in PLOS Digital Health from Public Library of Science
Bibliographic data for series maintained by digitalhealth ().