EconPapers    
Economics at your fingertips  
 

Optimal Oracle Inequalities for Projected Fixed-Point Equations, with Applications to Policy Evaluation

Wenlong Mou (), Ashwin Pananjady () and Martin J. Wainwright ()
Additional contact information
Wenlong Mou: University of California, Berkeley, Berkeley, California 94720
Ashwin Pananjady: Georgia Institute of Technology, Atlanta, Georgia 30332
Martin J. Wainwright: University of California, Berkeley, Berkeley, California 94720; Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Mathematics of Operations Research, 2023, vol. 48, issue 4, 2308-2336

Abstract: Linear fixed-point equations in Hilbert spaces arise in a variety of settings, including reinforcement learning, and computational methods for solving differential and integral equations. We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space. First, we prove an instance-dependent upper bound on the mean-squared error for a linear stochastic approximation scheme that exploits Polyak–Ruppert averaging. This bound consists of two terms: an approximation error term with an instance-dependent approximation factor and a statistical error term that captures the instance-specific complexity of the noise when projected onto the low-dimensional subspace. Using information-theoretic methods, we also establish lower bounds showing that both of these terms cannot be improved, again in an instance-dependent sense. A concrete consequence of our characterization is that the optimal approximation factor in this problem can be much larger than a universal constant. We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation, establishing their optimality.

Keywords: Primary: 62J05; 62M05; 65M15; 90C40; 93E24; linear fixed-point equations; Galerkin methods; temporal-difference methods; stochastic approximation; local minimax lower bounds (search for similar items in EconPapers)
Date: 2023
References: Add references at CitEc
Citations:

Downloads: (external link)
http://dx.doi.org/10.1287/moor.2022.1341 (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:inm:ormoor:v:48:y:2023:i:4:p:2308-2336

Access Statistics for this article

More articles in Mathematics of Operations Research from INFORMS Contact information at EDIRC.
Bibliographic data for series maintained by Chris Asher ().

 
Page updated 2025-03-19
Handle: RePEc:inm:ormoor:v:48:y:2023:i:4:p:2308-2336