Abstract:
We analyze the evolution of behavioral rules for learning how to play a two-armed bandit. Individuals have no information about the underlying pay-off distributions and have limited memory of their own past experience. Instead, they must rely on information obtained through observing the performance of other individuals. Evolution is modelled using the replicator dynamic with the revision behaviors as replicators. We find that evolution favors a special class of imitative rules. These so-called strictly improving rules (Schlag, 1996) are found to be neutrally stable when facing any two-armed bandit. Further emphasis is put on which rules survive when.