On the strategic learning of signal associations

Sherratt, Thomas N; Voll, James

Behavioral Ecology, 2022, vol. 33, issue 6, 1058-1069

Abstract: Signal detection theory (SDT) has been widely used to identify the optimal response of a receiver to a stimulus when it could be generated by more than one signaler type. While SDT assumes that the receiver adopts the optimal response at the outset, in reality, receivers often have to learn how to respond. We, therefore, recast a simple signal detection problem as a multi-armed bandit (MAB) in which inexperienced receivers choose between accepting a signaler (gaining information and an uncertain payoff) and rejecting it (gaining no information but a certain payoff). An exact solution to this exploration–exploitation dilemma can be identified by solving the relevant dynamic programming equation (DPE). However, to evaluate how the problem is solved in practice, we conducted an experiment. Here, humans (n = 135) were repeatedly presented with four readily discriminable signaler types, some of which were on average profitable, and others unprofitable, to accept in the long term. We then compared the performance of SDT, DPE, and three candidate exploration–exploitation models (Softmax, Thompson, and Greedy) in explaining the observed sequences of acceptance and rejection. All of the models predicted volunteer behavior well when signalers were clearly profitable or clearly unprofitable to accept. Overall, however, the Softmax and Thompson sampling models, which predict the optimal (SDT) response towards signalers with borderline profitability only after extensive learning, explained the responses of volunteers significantly better. By highlighting the relationship between the MAB and SDT models, we encourage others to evaluate how receivers strategically learn about their environments.
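
Illustrative sketch (not the authors' code): the accept/reject bandit described in the abstract can be mimicked for a single signaler type as below. Everything specific here is an assumption made for illustration: the payoffs (+1 or -1 for accepting, a certain 0 for rejecting), the Beta(1, 1) prior, the temperature `tau`, the tie-breaking rule for the greedy receiver, and the function names `choose_accept` and `simulate`. The paper's actual payoff structure and model fitting may differ.

```python
import math
import random

def choose_accept(rule, wins, losses, rng, tau=0.2):
    """Decide whether to accept, given past outcomes of accepting this signaler.

    Assumed payoffs (not from the paper): accepting pays +1 or -1;
    rejecting pays a certain 0 and yields no information.
    """
    a, b = 1 + wins, 1 + losses            # Beta(1, 1) prior (an assumption)
    mean_payoff = 2 * a / (a + b) - 1      # posterior expected payoff of accepting
    if rule == "greedy":
        # Accept whenever the current estimate is at least the reject payoff (0).
        return mean_payoff >= 0
    if rule == "softmax":
        # Logistic choice between accept (estimated payoff) and reject (0).
        p_accept = 1 / (1 + math.exp(-mean_payoff / tau))
        return rng.random() < p_accept
    if rule == "thompson":
        # Draw one plausible profitability from the posterior and act on it.
        return 2 * rng.betavariate(a, b) - 1 > 0
    raise ValueError(f"unknown rule: {rule}")

def simulate(rule, p_good, trials=200, seed=0):
    """Return the fraction of trials on which the receiver accepted."""
    rng = random.Random(seed)
    wins = losses = accepted = 0
    for _ in range(trials):
        if choose_accept(rule, wins, losses, rng):
            accepted += 1
            # Only accepting reveals the outcome, so only then do counts update.
            if rng.random() < p_good:
                wins += 1
            else:
                losses += 1
    return accepted / trials

if __name__ == "__main__":
    for rule in ("greedy", "softmax", "thompson"):
        print(rule, simulate(rule, p_good=0.55))
```

Under these assumptions, a greedy receiver that suffers an early loss can stop accepting altogether and so never corrects its estimate, whereas the Softmax and Thompson receivers keep sampling borderline signalers and approach the long-run optimal response only gradually.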

Keywords: Bayesian learning; decision theory; dynamic programming; multi-armed bandit; signal detection theory; Softmax; Thompson sampling
Date: 2022

Downloads:
http://hdl.handle.net/10.1093/beheco/arac027 (application/pdf)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:oup:beheco:v:33:y:2022:i:6:p:1058-1069.

Ordering information: This journal article can be ordered from
https://academic.oup.com/journals

Behavioral Ecology is currently edited by Louise Barrett

More articles in Behavioral Ecology from International Society for Behavioral Ecology Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK.
Bibliographic data for series maintained by Oxford University Press.