Coarse Q-learning in Decision-Making: Indifference vs. Indeterminacy vs. Instability

Philippe Jehiel and Aviman Satpathy

Papers from arXiv.org

Abstract: We introduce Coarse Q-learning (CQL), a reinforcement learning model of decision-making under payoff uncertainty in which alternatives are exogenously partitioned into coarse similarity classes (based on limited salience) and the agent maintains estimates (valuations) of expected payoffs only at the class level. Choices are softmax (multinomial logit) over class valuations and uniform within each class, and valuations update toward realized payoffs as in classical Q-learning with stochastic bandit feedback (Watkins and Dayan, 1992). Using stochastic approximation, we derive a continuous-time ODE limit of CQL dynamics and show that its steady states coincide with smooth (logit) perturbations of Valuation Equilibria (Jehiel and Samet, 2007). We demonstrate the possibility of multiple equilibria in decision trees with generic payoffs and establish local asymptotic stability of strict pure equilibria whenever they exist. By contrast, we provide sufficient conditions on the primitives under which every decision tree admits a unique, globally asymptotically stable mixed equilibrium that renders the agent indifferent across classes as sensitivity to payoff differences diverges. Nevertheless, convergence to equilibrium is not universal: we construct an open set of decision trees in which the unique mixed equilibrium is linearly unstable and the valuations converge to a stable limit cycle, with choice probabilities oscillating perpetually.
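
The update rule described in the abstract can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it assumes a single-stage choice problem rather than a general decision tree, Gaussian payoff noise, and a constant step size. The names `coarse_q_learning`, `lam`, `step`, and `noise` are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(values, lam):
    """Logit choice probabilities with sensitivity parameter lam."""
    m = max(values)  # subtract the max for numerical stability
    weights = [math.exp(lam * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def coarse_q_learning(classes, mean_payoff, lam=2.0, step=0.05,
                      rounds=5000, noise=0.1, seed=0):
    """Simulate class-level valuations in a one-shot choice problem.

    classes     : list of lists, an exogenous partition of the alternatives
    mean_payoff : dict mapping each alternative to its expected payoff
    All other parameters are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    valuations = [0.0] * len(classes)  # one estimate per similarity class
    for _ in range(rounds):
        # Softmax over class valuations picks a class.
        probs = softmax(valuations, lam)
        c = rng.choices(range(len(classes)), weights=probs)[0]
        # Uniform choice within the selected class.
        a = rng.choice(classes[c])
        # Bandit feedback: only the chosen alternative's noisy payoff is seen.
        payoff = mean_payoff[a] + rng.gauss(0.0, noise)
        # Move the chosen class's valuation toward the realized payoff.
        valuations[c] += step * (payoff - valuations[c])
    return valuations, softmax(valuations, lam)
```

For example, calling `coarse_q_learning([["a", "b"], ["c"]], {"a": 1.0, "b": 0.0, "c": 0.8})` pools alternatives `a` and `b` into one class, so the agent tracks only their average payoff and cannot learn that `a` alone is the best alternative. In this one-shot setting each class's expected payoff is fixed; the multiplicity and cycling results in the abstract concern decision trees, which this sketch does not model.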

Date: 2024-12, Revised 2025-12
New Economics Papers: this item is included in nep-mic

Downloads: (external link)
http://arxiv.org/pdf/2412.09321 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2412.09321

Page updated 2026-01-13
Handle: RePEc:arx:papers:2412.09321