Restless bandits, linear programming relaxations and a primal-dual index heuristic
Dimitris Bertsimas and José Niño-Mora
Economics Working Papers from Department of Economics and Business, Universitat Pompeu Fabra
Abstract:
We develop a mathematical programming approach for the classical PSPACE-hard restless bandit problem in stochastic optimization. We introduce a hierarchy of n (where n is the number of bandits) increasingly strong linear programming relaxations, the last of which is exact and corresponds to the (exponential-size) formulation of the problem as a Markov decision chain, while the other relaxations provide bounds and can be computed efficiently. We also propose a priority-index heuristic scheduling policy derived from the solution to the first-order relaxation, where the indices are defined in terms of optimal dual variables. In this way we obtain both a policy and a suboptimality guarantee. We report results of computational experiments that suggest that the proposed heuristic policy is nearly optimal. Moreover, the second-order relaxation is found to provide strong bounds on the optimal value.
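The idea of bounding the optimal value through a relaxation can be illustrated on a toy instance. The sketch below is not the paper's formulation. It assumes a discounted-reward criterion, arms with two actions (passive and active), and a requirement that exactly `m_active` arms be active in every period. It relaxes that requirement so it holds only in expected discounted total, and it bounds the optimum with a Lagrange multiplier rather than by solving a linear program. All names and numbers here are hypothetical.

```python
# Toy Lagrangian upper bound for a restless bandit (illustrative assumptions only).
BETA = 0.9  # assumed discount factor; the abstract does not state the criterion


def arm_value(rewards, trans, lam, beta=BETA, iters=500):
    """Value iteration for one arm when each active period is charged lam.

    rewards[a][i]: reward for action a (0 = passive, 1 = active) in state i.
    trans[a][i][j]: probability of moving from state i to j under action a.
    """
    num_states = len(rewards[0])
    v = [0.0] * num_states
    for _ in range(iters):
        v = [
            max(
                rewards[a][i] - lam * a
                + beta * sum(trans[a][i][j] * v[j] for j in range(num_states))
                for a in (0, 1)
            )
            for i in range(num_states)
        ]
    return v


def lagrangian_bound(arms, starts, m_active, lam, beta=BETA):
    """Upper bound on the optimal discounted reward for a fixed multiplier lam."""
    total = lam * m_active / (1.0 - beta)  # credit back the charged activity
    for (rewards, trans), start in zip(arms, starts):
        total += arm_value(rewards, trans, lam, beta)[start]
    return total


def best_bound(arms, starts, m_active, lams, beta=BETA):
    """Tightest bound over a grid of multipliers; returns (bound, multiplier)."""
    return min((lagrangian_bound(arms, starts, m_active, lam, beta), lam) for lam in lams)


# Hypothetical instance: two identical two-state arms, one active per period.
REWARDS = [[0.0, 0.0], [1.0, 0.2]]
TRANS = [
    [[0.5, 0.5], [0.8, 0.2]],  # passive dynamics
    [[0.3, 0.7], [0.1, 0.9]],  # active dynamics
]
ARMS = [(REWARDS, TRANS), (REWARDS, TRANS)]
STARTS = [0, 0]
LAMS = [k * 0.05 for k in range(-20, 41)]  # grid from -1.0 to 2.0, includes 0.0

bound, lam_star = best_bound(ARMS, STARTS, 1, LAMS)
print(f"upper bound {bound:.4f} at multiplier {lam_star:.2f}")
```

Under these assumptions every multiplier yields a valid upper bound, so the grid search only tightens the bound; a finer grid or a one-dimensional minimizer could replace it.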
Keywords: Stochastic scheduling; bandit problems; resource allocation; dynamic programming
JEL-codes: C60 C61
Date: 1994-08, Revised 1997-10
Downloads:
https://econ-papers.upf.edu/papers/301.pdf Whole Paper (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:upf:upfgen:301