Reinforcement Learning for Non-stationary Discrete-Time Linear–Quadratic Mean-Field Games in Multiple Populations

Zaman, Muhammad Aneeq uz; Miehling, Erik; Başar, Tamer

Muhammad Aneeq uz Zaman, Erik Miehling and Tamer Başar
Additional contact information
Muhammad Aneeq uz Zaman: University of Illinois at Urbana–Champaign
Erik Miehling: University of Illinois at Urbana–Champaign
Tamer Başar: University of Illinois at Urbana–Champaign

Dynamic Games and Applications, 2023, vol. 13, issue 1, No 6, 118-164

Abstract: Scalability of reinforcement learning algorithms to multi-agent systems is a significant bottleneck to their practical use. In this paper, we approach multi-agent reinforcement learning from a mean-field game perspective, where the number of agents tends to infinity. Our analysis focuses on the structured setting of systems with linear dynamics and quadratic costs, named linear–quadratic mean-field games, evolving over a discrete-time infinite horizon where agents are assumed to be partitioned into finitely many populations connected by a network of known structure. The functional forms of the agents’ costs and dynamics are assumed to be the same within populations, but differ between populations. We first characterize the equilibrium of the mean-field game which further prescribes an $\epsilon$-Nash equilibrium for the finite population game. Our main focus is on the design of a learning algorithm, based on zero-order stochastic optimization, for computing mean-field equilibria. The algorithm exploits the affine structure of both the equilibrium controller and equilibrium mean-field trajectory by decomposing the learning task into first learning the linear terms and then learning the affine terms. We present a convergence proof and a finite-sample bound quantifying the estimation error as a function of the number of samples.

Keywords: Mean-field games; Large population games on networks; Multi-agent reinforcement learning; Zero-order stochastic optimization
Date: 2023

Downloads: (external link)
http://link.springer.com/10.1007/s13235-022-00448-w Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:dyngam:v:13:y:2023:i:1:d:10.1007_s13235-022-00448-w

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/13235

DOI: 10.1007/s13235-022-00448-w

Dynamic Games and Applications is currently edited by Georges Zaccour

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.