
Adapting Reinforcement Learning Treatment Policies Using Limited Data to Personalize Critical Care

Matt Baucum, Anahita Khojandi, Rama Vasudevan and Robert Davis
Additional contact information
Matt Baucum: Department of Business Analytics, Information Systems & Supply Chain, Florida State University, Tallahassee, Florida 32306
Anahita Khojandi: Department of Industrial and Systems Engineering, University of Tennessee, Knoxville, Tennessee 37996
Rama Vasudevan: Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831
Robert Davis: University of Tennessee Health Science Center, Memphis, Tennessee 38163

INFORMS Journal on Data Science, 2022, vol. 1, issue 1, 27-49

Abstract: Reinforcement learning (RL) demonstrates promise for developing effective treatment policies in critical care settings. However, existing RL methods often require large and comprehensive patient data sets and do not readily lend themselves to settings in which certain patient subpopulations are severely underrepresented. In this study, we develop a new method, noisy Bayesian policy updates (NBPU), for selecting high-performing reinforcement learning–based treatment policies for underrepresented patient subpopulations using limited observations. Our method uses variational inference to learn a probability distribution over treatment policies based on a reference patient subpopulation for which sufficient data are available. It then exploits limited data from an underrepresented patient subpopulation to update this probability distribution and adapts its recommendations to this subpopulation. We demonstrate our method’s utility on a data set of ICU patients receiving intravenous blood anticoagulant medication. Our results show that NBPU outperforms state-of-the-art methods in terms of both selecting effective treatment policies for patients with nontypical clinical characteristics and predicting the corresponding policies’ performance for these patients.
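The abstract describes NBPU's update step only at a high level. As a rough, hypothetical sketch of that idea (not the paper's actual variational-inference construction), the Python snippet below maintains a categorical prior over a small set of candidate policies fit on the reference subpopulation, then reweights it with a Gaussian likelihood of a handful of returns observed in the underrepresented subpopulation; all names, values, and the noise model are illustrative assumptions.

    import numpy as np

    def update_policy_distribution(prior_logits, predicted_returns,
                                   observed_returns, noise_std=1.0):
        # Hypothetical stand-in for NBPU's update: a categorical prior over
        # K candidate policies is reweighted by how well each policy's
        # predicted return explains the few returns observed in the
        # underrepresented subpopulation (Gaussian noise assumed).
        log_post = np.asarray(prior_logits, dtype=float).copy()
        obs = np.asarray(observed_returns, dtype=float)
        for k, mu_k in enumerate(predicted_returns):
            log_post[k] += -0.5 * np.sum((obs - mu_k) ** 2) / noise_std ** 2
        log_post -= log_post.max()  # stabilize before exponentiating
        post = np.exp(log_post)
        return post / post.sum()

    # Toy example: three candidate policies, two observed returns.
    prior = np.log([0.5, 0.3, 0.2])        # prior from the reference subpopulation (assumed)
    predicted = np.array([0.4, 0.8, 1.2])  # each policy's predicted return (assumed)
    observed = np.array([0.7, 0.9])        # limited data from the new subpopulation
    posterior = update_policy_distribution(prior, predicted, observed)
    print(posterior, "-> recommend policy", int(np.argmax(posterior)))

Under these assumptions, policies whose reference-population predictions best explain the new observations gain posterior mass, which mirrors, in simplified form, how limited data can adapt a learned distribution over treatment policies.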

Keywords: artificial neural networks and deep learning; stochastic processes; learning and adaptive systems in artificial intelligence; Markov and semi-Markov decision processes; Bayesian problems
Date: 2022

Downloads: http://dx.doi.org/10.1287/ijds.2022.0015 (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:inm:orijds:v:1:y:2022:i:1:p:27-49

More articles in INFORMS Journal on Data Science from INFORMS. Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher.

 
Handle: RePEc:inm:orijds:v:1:y:2022:i:1:p:27-49