Optimizing pessimism in dynamic treatment regimes: a Bayesian learning approach
Yunzhe Zhou, Zhengling Qi, Chengchun Shi and Lexin Li
LSE Research Online Documents on Economics from London School of Economics and Political Science, LSE Library
Abstract:
In this article, we propose a novel pessimism-based Bayesian learning method for optimal dynamic treatment regimes in the offline setting. When the coverage condition does not hold, which is common for offline data, the existing solutions would produce sub-optimal policies. The pessimism principle addresses this issue by discouraging recommendation of actions that are less explored conditioning on the state. However, nearly all pessimism-based methods rely on a key hyper-parameter that quantifies the degree of pessimism, and the performance of the methods can be highly sensitive to the choice of this parameter. We propose to integrate the pessimism principle with Thompson sampling and Bayesian machine learning for optimizing the degree of pessimism. We derive a credible set whose boundary uniformly lower bounds the optimal Q-function, and thus we do not require additional tuning of the degree of pessimism. We develop a general Bayesian learning method that works with a range of models, from Bayesian linear basis model to Bayesian neural network model. We develop the computational algorithm based on variational inference, which is highly efficient and scalable. We establish the theoretical guarantees of the proposed method, and show empirically that it outperforms the existing state-of-the-art solutions through both simulations and a real data example.
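The pessimism principle described in the abstract can be illustrated with a minimal single-stage sketch. This is not the authors' algorithm: it assumes a conjugate Gaussian Bayesian linear basis model with one-hot action features, and it uses a fixed multiplier `z` on the posterior standard deviation, which is exactly the kind of hand-tuned pessimism parameter the paper proposes to avoid. The function names, the prior and noise variances, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def posterior(Phi, y, prior_var=10.0, noise_var=1.0):
    """Gaussian posterior over weights of a Bayesian linear basis model.

    Assumes a zero-mean isotropic Gaussian prior and known noise variance
    (both values are illustrative assumptions).
    """
    d = Phi.shape[1]
    precision = np.eye(d) / prior_var + Phi.T @ Phi / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ y / noise_var
    return mean, cov

def lower_bounds(phi_by_action, mean, cov, z=1.96):
    """Lower credible bound on Q(s, a) for each action's feature vector.

    z is a fixed, hand-chosen degree of pessimism (an assumption here).
    """
    q_mean = np.array([phi @ mean for phi in phi_by_action])
    q_sd = np.array([np.sqrt(phi @ cov @ phi) for phi in phi_by_action])
    return q_mean - z * q_sd, q_mean

# Toy offline data: action 0 is well explored, action 1 is barely explored.
Phi = np.vstack([np.tile([1.0, 0.0], (50, 1)),   # 50 samples of action 0
                 np.tile([0.0, 1.0], (2, 1))])   # 2 samples of action 1
y = np.concatenate([np.full(50, 1.0), np.full(2, 1.5)])

mean, cov = posterior(Phi, y)
actions = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
lower, q_mean = lower_bounds(actions, mean, cov)

greedy = int(np.argmax(q_mean))      # ignores posterior uncertainty
pessimistic = int(np.argmax(lower))  # penalises poorly explored actions
```

In this toy data the greedy rule favours the barely explored action because its posterior mean is higher, while the lower-bound rule favours the well-explored action because its credible interval is much narrower.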
JEL-codes: C1
Date: 2023-12-31
New Economics Papers: this item is included in nep-big, nep-cmp, nep-ecm and nep-ger
Published in Proceedings of Machine Learning Research, 31, December, 2023, 206. ISSN: 1938-7228
Downloads:
http://eprints.lse.ac.uk/118233/ Open access version. (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:ehl:lserod:118233