Enhancing the prediction of acute kidney injury risk after percutaneous coronary intervention using machine learning techniques: A retrospective cohort study

Chenxi Huang, Karthik Murugiah, Shiwani Mahajan, Shu-Xia Li, Sanket S Dhruva, Julian S Haimovich, Yongfei Wang, Wade L Schulz, Jeffrey M Testani, Francis P Wilson, Carlos I Mena, Frederick A Masoudi, John S Rumsfeld, John A Spertus, Bobak J Mortazavi and Harlan M Krumholz

PLOS Medicine, 2018, vol. 15, issue 11, 1-20

Abstract:
Background: The current acute kidney injury (AKI) risk prediction model for patients undergoing percutaneous coronary intervention (PCI) from the American College of Cardiology (ACC) National Cardiovascular Data Registry (NCDR) employed regression techniques. This study aimed to evaluate whether models using machine learning techniques could significantly improve AKI risk prediction after PCI.
Methods and findings: We used the same cohort and candidate variables used to develop the current NCDR CathPCI Registry AKI model, including 947,091 patients who underwent PCI procedures between June 1, 2009, and June 30, 2011. The mean age of these patients was 64.8 years, and 32.8% were women, with a total of 69,826 (7.4%) AKI events. We replicated the current AKI model as the baseline model and compared it with a series of new models. Temporal validation was performed using data from 970,869 patients undergoing PCIs between July 1, 2016, and March 31, 2017, with a mean age of 65.7 years; 31.9% were women, and 72,954 (7.5%) had AKI events. Each model was derived by implementing one of two strategies for preprocessing candidate variables (preselecting and transforming candidate variables or using all candidate variables in their original forms), one of three variable-selection methods (stepwise backward selection, lasso regularization, or permutation-based selection), and one of two methods to model the relationship between variables and outcome (logistic regression or gradient descent boosting). The cohort was divided into different training (70%) and test (30%) sets using 100 different random splits, and the performance of the models was evaluated internally in the test sets. The best model, according to the internal evaluation, was derived by using all available candidate variables in their original form, permutation-based variable selection, and gradient descent boosting.
Compared with the baseline model that uses 11 variables, the best model used 13 variables and achieved a significantly better area under the receiver operating characteristic curve (AUC) of 0.752 (95% confidence interval [CI] 0.749–0.754) versus 0.711 (95% CI 0.708–0.714), a significantly better Brier score of 0.0617 (95% CI 0.0615–0.0618) versus 0.0636 (95% CI 0.0634–0.0638), and a better calibration slope of observed versus predicted rate of 1.008 (95% CI 0.988–1.028) versus 1.036 (95% CI 1.015–1.056). The best model also had a significantly wider predictive range (25.3% versus 21.6%, p
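
As an illustration of the evaluation described in the abstract, the following is a minimal sketch of a 70/30 random split and of two of the reported metrics, the Brier score and the AUC. The function names, the seed handling, and the toy data are hypothetical and are not taken from the paper; model fitting, confidence intervals, and the calibration slope are not covered.

```python
import random


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


def random_split(n, train_frac=0.7, seed=0):
    """Shuffle indices 0..n-1 and cut them into training and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(train_frac * n)
    return idx[:cut], idx[cut:]


# Toy outcomes (1 = event) and predicted risks; purely illustrative values.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print("Brier score:", brier_score(y, p))
print("AUC:", auc(y, p))

# Repeating the split with different seeds mimics the repeated random splits.
splits = [random_split(10, train_frac=0.7, seed=s) for s in range(100)]
print("Number of splits:", len(splits))
```

In a setup like the one the abstract describes, each split would be used to fit a model on the training indices and to compute these metrics on the test indices, with the results then summarized across splits.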

Date: 2018

Downloads: (external link)
https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1002703 (text/html)
https://journals.plos.org/plosmedicine/article/fil ... 02703&type=printable (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:plo:pmed00:1002703

DOI: 10.1371/journal.pmed.1002703


More articles in PLOS Medicine from Public Library of Science

 
Page updated 2025-03-19
Handle: RePEc:plo:pmed00:1002703