Abstract:
This paper is concerned with the need for, and the implications of, ε-optimality in learning problems. The authors consider a control problem in which a Bayesian decisionmaker faces a trade-off between expected current reward and accumulation of information. An example showing the need for the notion of ε-optimality and the possibility of discontinuous transition functions is given. It is shown that there is always an ε-optimal policy that allows the decisionmaker to learn any identified parameters, but that there are other ε-optimal policies with very different limit behavior. Copyright 1989 by Economics Department of the University of Pennsylvania and the Osaka University Institute of Social and Economic Research Association.