Reinforcement Learning with Restrictions on the Action Set

Bravo, Mario; Faure, Mathieu

Mario Bravo and Mathieu Faure
Additional contact information
Mario Bravo: USACH - Universidad de Santiago de Chile [Santiago]

Post-Print from HAL

Abstract: Consider a two-player normal-form game repeated over time. We introduce an adaptive learning procedure in which the players observe only their own realized payoff at each stage. We assume that agents do not know their own payoff function and have no information about the other player. Furthermore, we assume that their actions are restricted: at each stage, their choice is limited to a subset of their action set. We prove that the empirical distributions of play converge to the set of Nash equilibria for zero-sum games, potential games, and games in which one player has two actions.

Keywords: Quantitative economics
Date: 2015-01

Published in SIAM Journal on Control and Optimization, 2015, 53 (1), pp.287--312. ⟨10.1137/130936488⟩

Related works:
Working Paper: Reinforcement Learning with Restrictions on the Action Set (2013)

Persistent link: https://EconPapers.repec.org/RePEc:hal:journl:hal-01457301

DOI: 10.1137/130936488
