Coarse Q-learning in Decision-Making: Indifference vs. Indeterminacy vs. Instability
Philippe Jehiel and
Aviman Satpathy
Papers from arXiv.org
Abstract:
We introduce Coarse Q-learning (CQL), a reinforcement learning model of decision-making under payoff uncertainty where alternatives are exogenously partitioned into coarse similarity classes (based on limited salience) and the agent maintains estimates (valuations) of expected payoffs only at the class level. Choices are modeled as softmax (multinomial logit) over class valuations and uniform within each class, and valuations update toward realized payoffs as in classical Q-learning with stochastic bandit feedback (Watkins and Dayan, 1992). Using stochastic approximation, we derive a continuous-time ODE limit of CQL dynamics and show that its steady states coincide with smooth (logit) perturbations of Valuation Equilibria (Jehiel and Samet, 2007). We demonstrate the possibility of multiple equilibria in decision trees with generic payoffs and establish local asymptotic stability of strict pure equilibria whenever they exist. By contrast, we provide sufficient conditions on the primitives under which every decision tree admits a unique, globally asymptotically stable mixed equilibrium that renders the agent indifferent across classes as sensitivity to payoff differences diverges. Nevertheless, convergence to equilibrium is not universal: we construct an open set of decision trees where the unique mixed equilibrium is linearly unstable and the valuations converge to a stable limit cycle, with choice probabilities perpetually oscillating.
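The learning rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it simplifies the setting: it assumes a single one-shot choice rather than a decision tree, Gaussian payoff noise, and a constant step size. The names coarse_q_learning, beta (sensitivity to payoff differences), step, and noise_sd, as well as the payoff numbers, are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(values, beta):
    """Multinomial logit probabilities with sensitivity parameter beta."""
    m = max(values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def coarse_q_learning(classes, beta=2.0, step=0.05, n_rounds=20000,
                      noise_sd=1.0, seed=0):
    """Simulate class-level valuation updates (illustrative assumptions only).

    classes: list of similarity classes; each class is a list holding the
             mean payoffs of the alternatives it contains.
    Returns the final class valuations and the implied choice probabilities.
    """
    rng = random.Random(seed)
    valuations = [0.0] * len(classes)  # one estimate per class, not per alternative
    for _ in range(n_rounds):
        # Softmax over class valuations selects a class.
        probs = softmax(valuations, beta)
        k = rng.choices(range(len(classes)), weights=probs)[0]
        # Uniform choice among the alternatives inside the chosen class.
        mean_payoff = rng.choice(classes[k])
        # Bandit feedback: only the realized payoff of the chosen alternative is seen.
        payoff = rng.gauss(mean_payoff, noise_sd)
        # Move the chosen class's valuation toward the realized payoff.
        valuations[k] += step * (payoff - valuations[k])
    return valuations, softmax(valuations, beta)


# Hypothetical example: two classes with two and three alternatives.
classes = [[1.0, 0.0], [0.8, 0.6, 0.4]]
valuations, probs = coarse_q_learning(classes)
print(valuations, probs)
```

In this one-shot sketch the expected payoff of a class does not depend on the agent's behavior, so the valuations simply fluctuate around the within-class average payoffs. The multiplicity and limit cycles discussed in the abstract arise in decision trees, where a class's payoff depends on later choices, and that feature is deliberately left out here.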
Date: 2024-12, Revised 2025-12
New Economics Papers: this item is included in nep-mic
Downloads: (external link)
http://arxiv.org/pdf/2412.09321 Latest version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2412.09321