
Primary prevention cardiovascular disease risk prediction model for contemporary Chinese (1°P-CARDIAC): Model derivation and validation using a hybrid statistical and machine-learning approach

Yekai Zhou, Celia Jiaxi Lin, Qiuyan Yu, Joseph Edgar Blais, Eric Yuk Fai Wan, Emmanuel Wong, Kathryn Tan, David Chung-Wah Siu, Kai Hang Yiu, Esther Wai Yin Chan, Doris Yu, William Wong, Tak-Wah Lam, Ian Chi Kei Wong, Ruibang Luo and Celine S L Chui

PLOS ONE, 2025, vol. 20, issue 7, 1-15

Abstract: Background: Cardiovascular disease (CVD) is the leading cause of mortality and morbidity in China and worldwide, yet validated primary prevention models developed specifically for Chinese populations are lacking. To identify individuals at high risk of CVD for early intervention, we created and validated a primary prevention risk prediction model, Personalized CARdiovascular DIsease risk Assessment for Chinese (1°P-CARDIAC), in contemporary Chinese cohorts in Hong Kong. Methods: Patients without any history of CVD were divided into derivation and validation cohorts according to the geographical location of their residence in Hong Kong. The outcome was the first diagnosis of a composite of coronary heart disease, ischemic or hemorrhagic stroke, peripheral artery disease, and revascularization. The full model incorporated all available variables in the dataset, including clinical laboratory tests, disease and medication history, family history of disease, demographic factors, and healthcare utilization. We employed an XGBoost Cox model for derivation and multivariate imputation by chained equations (MICE) for missing data replacement. A basic model was developed from a subset of statistically significant and important risk variables selected by least absolute shrinkage and selection operator (LASSO) regression. Validation was performed with 1000 bootstrap replicates, and performance was compared with four existing models: PREDICT, pooled cohort equation (PCE), China-PAR, and Framingham (Asian). Results: The study included 179,953 patients in the derivation cohort and 1,083,924 patients across two independent validation cohorts. A total of 103 covariates were included in the full model, whereas 8 covariates were included in the basic model. The full model demonstrated good performance, with a C-statistic of 0.87 (95% CI: 0.87, 0.87) and a calibration slope of 0.94. The basic model had a C-statistic of 0.75 (95% CI: 0.75, 0.75) and a calibration slope of 0.91. The comparison risk models had lower C-statistics, ranging from 0.68 to 0.72. Conclusion: We developed and validated 1°P-CARDIAC, a CVD risk prediction model for primary prevention, using a novel hybrid statistical and machine-learning approach. Validation results suggest that it may offer improved performance compared with commonly used risk models. 1°P-CARDIAC yields a similar level of accuracy and performance in its basic and full models. It demonstrated both effectiveness and versatility in harnessing big data and has the potential to serve as a promising method for CVD primary prevention and for improving public health outcomes.
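
For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of how a C-statistic and a bootstrap percentile confidence interval can be computed for a survival risk score. It is not the authors' code: the function names `c_statistic` and `bootstrap_ci`, the toy data, and the choice of Harrell's pairwise definition with a percentile interval are all assumptions made for illustration.

```python
import random


def c_statistic(times, events, risks):
    """Harrell's C: share of comparable pairs in which the subject who
    had the event earlier also received the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only anchored by an observed event
        for j in range(n):
            if times[i] < times[j]:  # subject j was still event-free at times[i]
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable if comparable else float("nan")


def bootstrap_ci(times, events, risks, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = c_statistic(
            [times[k] for k in idx],
            [events[k] for k in idx],
            [risks[k] for k in idx],
        )
        if c == c:  # skip replicates with no comparable pairs (NaN)
            stats.append(c)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


# Hypothetical toy data: follow-up time, event indicator (1 = CVD event,
# 0 = censored), and a model's predicted risk for six subjects.
times = [5, 8, 3, 10, 2, 7]
events = [1, 0, 1, 0, 1, 1]
risks = [0.6, 0.3, 0.8, 0.1, 0.9, 0.2]

c = c_statistic(times, events, risks)  # 13 of 14 comparable pairs are concordant
lower, upper = bootstrap_ci(times, events, risks)
print(f"C-statistic: {c:.3f} (95% CI: {lower:.3f}, {upper:.3f})")
```

The pairwise loop is quadratic in the number of subjects, so a cohort of the size described in the abstract would in practice call for an optimized library routine; the sketch is only meant to make the definition concrete.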

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0322419 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 22419&type=printable (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0322419

DOI: 10.1371/journal.pone.0322419


More articles in PLOS ONE from Public Library of Science

 
Page updated 2025-08-02
Handle: RePEc:plo:pone00:0322419