Assessing the value of integrating national longitudinal shopping data into respiratory disease forecasting models

Elizabeth Dolan, James Goulding, Harry Marshall, Gavin Smith, Gavin Long and Laila J. Tata
Additional contact information
Elizabeth Dolan: N/LAB, Nottingham University Business School, University of Nottingham
James Goulding: N/LAB, Nottingham University Business School, University of Nottingham
Harry Marshall: N/LAB, Nottingham University Business School, University of Nottingham
Gavin Smith: N/LAB, Nottingham University Business School, University of Nottingham
Gavin Long: N/LAB, Nottingham University Business School, University of Nottingham
Laila J. Tata: Lifespan and Population Health, School of Medicine, University of Nottingham

Nature Communications, 2023, vol. 14, issue 1, 1-19

Abstract: The COVID-19 pandemic led to unparalleled pressure on healthcare services. Improved healthcare planning in relation to diseases affecting the respiratory system has consequently become a key concern. We investigated the value of integrating sales of non-prescription medications commonly bought for managing respiratory symptoms, to improve forecasting of weekly registered deaths from respiratory disease at local levels across England, by using over 2 billion transactions logged by a UK high street retailer from March 2016 to March 2020. We report the results from the novel AI (Artificial Intelligence) explainability variable importance tool Model Class Reliance implemented on the PADRUS model (Prediction of Amount of Deaths by Respiratory disease Using Sales). PADRUS is a machine learning model optimised to predict registered deaths from respiratory disease in 314 local authority areas across England through the integration of shopping sales data, with a focus on purchases of non-prescription medications. We found strong evidence that models incorporating sales data significantly outperform other models that solely use variables traditionally associated with respiratory disease (e.g. sociodemographics and weather data). Accuracy gains are highest (increases in R² (coefficient of determination) of between 0.09 and 0.11) in periods of maximum risk to the general public. Results demonstrate the potential to utilise sales data to monitor population health with information at a high level of geographic granularity.
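
Note on the reported metric: the accuracy gains in the abstract are differences in the coefficient of determination between models with and without sales data. The following is a minimal sketch of how such a gain could be computed. It is not the PADRUS implementation; the function name `r_squared`, the variable names and the toy numbers are all assumptions invented for illustration and do not come from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy weekly death counts, invented purely for illustration (not from the paper).
observed = [10, 12, 15, 20, 18]
# Hypothetical predictions from a model without sales features.
baseline_pred = [11, 11, 14, 17, 19]
# Hypothetical predictions from a model that also uses sales features.
with_sales_pred = [10, 12, 14, 19, 18]

r2_baseline = r_squared(observed, baseline_pred)
r2_with_sales = r_squared(observed, with_sales_pred)

# The "accuracy gain" is the difference in R² between the two models.
r2_gain = r2_with_sales - r2_baseline
print(f"R² baseline={r2_baseline:.3f}, with sales={r2_with_sales:.3f}, gain={r2_gain:.3f}")
```

A positive `r2_gain` indicates that the model with the additional features explains more of the variance in the observed counts than the baseline does on the same data.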

Date: 2023

Downloads: (external link)
https://www.nature.com/articles/s41467-023-42776-4 Abstract (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-42776-4

Ordering information: This journal article can be ordered from
https://www.nature.com/ncomms/

DOI: 10.1038/s41467-023-42776-4

Nature Communications is currently edited by Nathalie Le Bot, Enda Bergin and Fiona Gillespie

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-19
Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-42776-4