Interactive model building for Q-learning
Eric B. Laber, Kristin A. Linn and Leonard A. Stefanski
Biometrika, 2014, vol. 101, issue 4, 831-847
Abstract:
Evidence-based rules for optimal treatment allocation are key components in the quest for efficient, effective health-care delivery. Q-learning, an approximate dynamic programming algorithm, is a popular method for estimating optimal sequential decision rules from data. Q-learning requires the modelling of nonsmooth, nonmonotone transformations of the data, complicating the search for adequately expressive, yet parsimonious, statistical models. The default Q-learning working model is multiple linear regression, which not only is misspecified under most data-generating models but also results in nonregular regression estimators, complicating inference. We propose an alternative strategy for estimating optimal sequential decision rules for which the requisite statistical modelling does not depend on nonsmooth, nonmonotone transformed data, does not result in nonregular regression estimators, is consistent under more data-generation models than is Q-learning, results in estimated sequential decision rules that have better sampling properties, and is amenable to established statistical methods for exploratory data analysis, model building and validation. We derive the new method, IQ-learning, via an interchange in the order of certain steps in Q-learning. In simulated experiments, IQ-learning improves upon Q-learning in terms of integrated mean-squared error and power. The method is illustrated using data from a study of major depressive disorder.
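The standard Q-learning procedure that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration only, assuming two decision stages, binary treatments coded -1/+1, and linear working models at both stages. The simulated data-generating model, the variable names and the helper functions are hypothetical and are not taken from the paper. The sketch shows ordinary Q-learning, not the authors' IQ-learning method.

```python
import numpy as np


def fit_ols(X, y):
    """Return least-squares coefficients for the linear working model y ~ X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def q_learning_two_stage(x1, a1, x2, a2, y):
    """Two-stage Q-learning with linear working models (treatments in {-1, +1})."""
    ones = np.ones(len(y))

    # Stage 2: Q2(h2, a2) = m(h2) + a2 * c(h2), with m and c linear in h2.
    h2 = np.column_stack([ones, x1, a1, x2])
    beta2 = fit_ols(np.column_stack([h2, a2[:, None] * h2]), y)
    p = h2.shape[1]
    m_hat = h2 @ beta2[:p]   # estimated main effect
    c_hat = h2 @ beta2[p:]   # estimated treatment contrast

    # Pseudo-outcome: max over a2 in {-1, +1} of Q2 equals m + |c|.
    y_tilde = m_hat + np.abs(c_hat)

    # Stage 1: regress the pseudo-outcome on first-stage history and treatment.
    h1 = np.column_stack([ones, x1])
    beta1 = fit_ols(np.column_stack([h1, a1[:, None] * h1]), y_tilde)
    return beta1, beta2, y_tilde


# Hypothetical simulated data (an assumption for illustration, not from the paper).
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
a1 = rng.choice([-1.0, 1.0], size=n)
x2 = 0.5 * x1 + 0.3 * a1 + rng.normal(size=n)
a2 = rng.choice([-1.0, 1.0], size=n)
y = x1 + a2 * (0.5 + x2) + rng.normal(size=n)

beta1, beta2, y_tilde = q_learning_two_stage(x1, a1, x2, a2, y)

# Estimated decision rules: treat according to the sign of the fitted contrast.
rule2 = np.sign(np.column_stack([np.ones(n), x1, a1, x2]) @ beta2[4:])
rule1 = np.sign(np.column_stack([np.ones(n), x1]) @ beta1[2:])
```

The absolute value in the pseudo-outcome is the nonsmooth, nonmonotone transformation the abstract refers to: the first-stage regression must model m + |c| even when m and c are themselves simple linear functions, which is why a linear first-stage working model is generally misspecified.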
Date: 2014
Full text: http://hdl.handle.net/10.1093/biomet/asu043 (application/pdf)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:oup:biomet:v:101:y:2014:i:4:p:831-847.
Biometrika is currently edited by Paul Fearnhead
More articles in Biometrika from Biometrika Trust Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK.