EconPapers    
Economics at your fingertips  
 

Robust Learning for Optimal Dynamic Treatment Regimes with Observational Data

Shosei Sakaguchi

Papers from arXiv.org

Abstract: Public policies and medical interventions often involve dynamics in their treatment assignments, where individuals receive a series of interventions over multiple stages. We study the statistical learning of optimal dynamic treatment regimes (DTRs) that determine the optimal treatment assignment for each individual at each stage based on their evolving history. We propose a doubly robust, classification-based approach to learning the optimal DTR using observational data under the assumption of sequential ignorability. The approach learns the optimal DTR through backward induction. At each step, it constructs an augmented inverse probability weighting (AIPW) estimator of the policy value function and maximizes it to learn the optimal policy for the corresponding stage. We show that the resulting DTR achieves an optimal convergence rate of $n^{-1/2}$ for welfare regret under mild convergence conditions on estimators of the nuisance components.

Date: 2024-03, Revised 2025-02
New Economics Papers: this item is included in nep-ecm
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://arxiv.org/pdf/2404.00221 Latest version (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2404.00221

Access Statistics for this paper

More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().

 
Page updated 2025-03-19
Handle: RePEc:arx:papers:2404.00221