Learning in Random Utility Models Via Online Decision Problems
Emerson Melo
Papers from arXiv.org
Abstract:
This paper studies the Random Utility Model (RUM) in a repeated stochastic choice setting in which the decision maker is imperfectly informed about the payoffs of each available alternative. We develop a gradient-based learning algorithm by embedding the RUM into an online decision problem. We show that a large class of RUMs is Hannan consistent (Hannan, 1957); that is, the average difference between the expected payoffs generated by a RUM and those of the best fixed policy in hindsight goes to zero as the number of periods increases. In addition, we show that our gradient-based algorithm is equivalent to the Follow the Regularized Leader (FTRL) algorithm, which is widely used in the machine learning literature to model learning in repeated stochastic choice problems. Thus, we provide an economically grounded optimization framework for the FTRL algorithm. Finally, we apply our framework to study recency bias, no-regret learning in normal form games, and prediction markets.
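As a rough illustration of the link between RUMs and FTRL, the sketch below covers only the familiar logit special case, in which FTRL with an entropic regularizer reduces to a softmax over cumulative payoffs (the exponential weights rule). It is not the paper's general algorithm. The function names `logit_choice` and `average_regret`, the step size `eta`, and the uniformly random payoffs are all assumptions made for this example.

```python
import numpy as np

def logit_choice(cum_payoffs, eta):
    """Logit choice probabilities: softmax of eta * cumulative payoffs.

    This coincides with FTRL under an entropic regularizer (assumed special case).
    """
    z = eta * cum_payoffs
    z = z - z.max()  # subtract the max for numerical stability
    w = np.exp(z)
    return w / w.sum()

def average_regret(payoffs, eta):
    """Average regret against the best fixed alternative in hindsight.

    payoffs: array of shape (T, n) with entries in [0, 1] (hypothetical data).
    """
    T, n = payoffs.shape
    cum = np.zeros(n)   # cumulative payoff of each alternative so far
    expected = 0.0      # cumulative expected payoff of the learner
    for t in range(T):
        p = logit_choice(cum, eta)   # choose using past payoffs only
        expected += p @ payoffs[t]   # expected payoff in period t
        cum += payoffs[t]            # payoffs are revealed after choosing
    return (cum.max() - expected) / T

# Illustrative run on made-up payoffs.
rng = np.random.default_rng(0)
T, n = 2000, 5
payoffs = rng.uniform(size=(T, n))
eta = np.sqrt(8.0 * np.log(n) / T)  # standard tuning for payoffs in [0, 1]
print(average_regret(payoffs, eta))
```

With payoffs in [0, 1] and this choice of `eta`, the standard exponential weights bound gives an average regret of at most sqrt(ln(n) / (2T)), which vanishes as T grows. That vanishing average regret is the Hannan consistency property described in the abstract.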
Date: 2021-12, Revised 2022-08
New Economics Papers: this item is included in nep-big, nep-dcm and nep-upt
Downloads: (external link)
http://arxiv.org/pdf/2112.10993 Latest version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2112.10993