Racial disparities in continuous glucose monitoring-based 60-min glucose predictions among people with type 1 diabetes
Helene Bei Thomsen,
Livie Yumeng Li,
Anders Aasted Isaksen,
Benjamin Lebiecka-Johansen,
Charline Bour,
Guy Fagherazzi,
William P T M van Doorn,
Tibor V Varga and
Adam Hulman
PLOS Digital Health, 2025, vol. 4, issue 6, 1-13
Abstract:
Non-Hispanic white (White) populations are overrepresented in medical studies. Potential healthcare disparities can happen when machine learning models, used in diabetes technologies, are trained on data from primarily White patients. We aimed to evaluate algorithmic fairness in glucose predictions. This study utilized continuous glucose monitoring (CGM) data from 101 White and 104 Black participants with type 1 diabetes collected by the JAEB Center for Health Research, US. Long short-term memory (LSTM) deep learning models were trained on 11 datasets of different proportions of White and Black participants and tailored to each individual using transfer learning to predict glucose 60 minutes ahead based on 60-minute windows. Root mean squared errors (RMSE) were calculated for each participant. Linear mixed-effect models were used to investigate the association between racial composition and RMSE while accounting for age, sex, and training data size. A median of 9 weeks (IQR: 7, 10) of CGM data was available per participant. The divergence in performance (RMSE slope by proportion) was not statistically significant for either group. However, the slope difference (from 0% White and 100% Black to 100% White and 0% Black) between groups was statistically significant (p = 0.02), meaning the RMSE increased 0.04 [0.01, 0.08] mmol/L more for Black participants compared to White participants when the proportion of White participants increased from 0 to 100% in the training data. This difference was attenuated in the transfer learned models (RMSE: 0.02 [-0.01, 0.05] mmol/L, p = 0.20). The racial composition of training data created a small statistically significant difference in the performance of the models, which was not present after using transfer learning. 
This demonstrates the importance of diversity in datasets and the potential value of transfer learning for developing more fair prediction models.
Author summary: Non-Hispanic White populations are often overrepresented in medical datasets. Training machine learning models on such data may lead to unfair clinical prediction tools and an unfavorable impact on healthcare inequalities. This study investigated how well machine learning models perform in predicting blood sugar levels for Non-Hispanic White and Non-Hispanic Black people with type 1 diabetes. We used continuous glucose monitoring (CGM) data from people with type 1 diabetes living in the US to compare various methods and models trained on datasets with different proportions of White and Black participants. We found a difference between the performance improvement in White and the performance drop in Black participants as the proportion of White participants increased in the dataset used for training. This difference disappeared when models were further tailored to individuals. Our work demonstrates the importance of using diverse training data when developing AI-based solutions for healthcare.
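The evaluation setup described in the abstract (60-minute input windows, a 60-minute prediction horizon, and one RMSE per participant) can be illustrated with a minimal sketch. This is not the authors' code: the 5-minute sampling interval, the function names, and the persistence baseline standing in for the LSTM are all assumptions made for illustration.

```python
import numpy as np

SAMPLE_MINUTES = 5  # assumption: one CGM reading every 5 minutes
HISTORY_STEPS = 60 // SAMPLE_MINUTES  # readings in a 60-minute input window
HORIZON_STEPS = 60 // SAMPLE_MINUTES  # readings between last input and target


def make_windows(glucose, history_steps=HISTORY_STEPS, horizon_steps=HORIZON_STEPS):
    """Slice one participant's glucose trace into (input window, target) pairs.

    Each target is the reading `horizon_steps` after the last reading
    of its input window.
    """
    glucose = np.asarray(glucose, dtype=float)
    n = len(glucose) - history_steps - horizon_steps + 1
    if n <= 0:
        # Trace too short to form a single window/target pair.
        return np.empty((0, history_steps)), np.empty(0)
    X = np.stack([glucose[i:i + history_steps] for i in range(n)])
    y = glucose[history_steps + horizon_steps - 1:]
    return X, y


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the inputs (e.g. mmol/L)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def persistence_forecast(X):
    """Hypothetical stand-in for a trained model: repeat the last observed reading."""
    return X[:, -1]


def participant_rmse(glucose):
    """Compute one RMSE value for one participant's trace."""
    X, y = make_windows(glucose)
    return rmse(y, persistence_forecast(X))
```

Calling `participant_rmse` on each participant's trace yields one RMSE per participant, which is the kind of outcome the abstract describes analysing with linear mixed-effect models. A trained model's predictions would replace `persistence_forecast` in a real evaluation.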
Date: 2025
Downloads: (external link)
https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000918 (text/html)
https://journals.plos.org/digitalhealth/article/fi ... 00918&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pdig00:0000918
DOI: 10.1371/journal.pdig.0000918
More articles in PLOS Digital Health from Public Library of Science