Evolutionary Learning in a Principal-Agent Model

Arifovic, Jasmina; Karaivanov, Alexander

No 505, 2006 Meeting Papers from Society for Economic Dynamics

Abstract: We introduce learning based on genetic algorithms in a principal-agent model of optimal contracting under moral hazard. Applications of this setting abound in finance (credit under moral hazard), public finance (optimal taxation, information-constrained insurance), development (sharecropping), mechanism design, etc. It is well known that optimal contracts in principal-agent problems with risk-averse agents, unobserved labor effort and stochastic technology can take quite complicated forms due to the trade-off between provision of incentives and insurance. The optimal contract typically depends on both parties' preferences, the properties of the technology and the stochastic properties of the endowment/income process. The existing literature typically assumes that the actions undertaken by the agent are unobserved by the principal, while the principal has perfect knowledge of things that are realistically much harder (or at least as hard) to know or observe, such as the agent's preferences and decision-making process. To our knowledge, few models exist of how the principal acquires this information. A possible solution that we explore is to explicitly model the principal's learning process about the agent's preferences and the production technology based only on observable information such as output realizations (or messages about them). For simplicity we assume a repeated one-period contracting framework in an output-sharing model which can be thought of as a sharecropping or equity arrangement. An asset owner (principal) contracts with an agent to produce jointly. The principal supplies the asset while the agent supplies unobservable labor effort. Output is stochastic and the probability of a given realization depends on the agent's effort. The principal wants to design and implement an optimal compensation scheme for the agent that maximizes profits subject to a participation constraint for the agent.
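The output-sharing setup described above can be made concrete with a small numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: two output levels, a probability of high output equal to effort, square-root utility, quadratic effort cost, a zero reservation utility, and the function names themselves. The sketch finds the profit-maximizing share by grid search, letting the agent best-respond to each candidate contract.

```python
import math

# Assumed illustrative parameters (not from the paper).
Y_LOW, Y_HIGH = 1.0, 4.0      # two possible output realizations
EFFORT_COST = 2.0             # scale of the quadratic effort cost
RESERVATION_UTILITY = 0.0     # agent's outside option


def agent_best_effort(share):
    """Effort in [0, 1] maximizing the agent's expected utility.

    Assumes P(high output) = effort and utility sqrt(consumption),
    so the first-order condition has a closed form.
    """
    gain = math.sqrt(share * Y_HIGH) - math.sqrt(share * Y_LOW)
    return min(1.0, max(0.0, gain / EFFORT_COST))


def agent_utility(share, effort):
    """Agent's expected utility net of the effort cost."""
    return (effort * math.sqrt(share * Y_HIGH)
            + (1.0 - effort) * math.sqrt(share * Y_LOW)
            - EFFORT_COST * effort ** 2 / 2.0)


def principal_profit(share, effort):
    """Principal's expected profit: the retained fraction of expected output."""
    expected_output = effort * Y_HIGH + (1.0 - effort) * Y_LOW
    return (1.0 - share) * expected_output


def optimal_share(grid_size=1001):
    """Grid search over shares, anticipating the agent's best response."""
    best_share, best_profit = None, -math.inf
    for i in range(grid_size):
        share = i / (grid_size - 1)
        effort = agent_best_effort(share)
        # Skip contracts that violate the participation constraint.
        if agent_utility(share, effort) < RESERVATION_UTILITY:
            continue
        profit = principal_profit(share, effort)
        if profit > best_profit:
            best_share, best_profit = share, profit
    return best_share, best_profit
```

Calling `optimal_share()` returns the share and expected profit of the full-rationality benchmark for these assumed primitives. The principal in this sketch knows the agent's preferences, which is exactly the assumption the learning models in the paper relax.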
Our primary goal is to investigate whether commonly used evolutionary learning algorithms lead to convergence to the underlying optimal contract under full rationality as studied by the mechanism design literature (e.g. Hart and Holmstrom, 1986) and, if so, how much time is needed. If, on the other hand, the optimal contract is never reached, we are interested in whether the learning process instead converges to some simple "rule of thumb" policy as often observed in reality. The exercise we perform can be evaluated from two opposing points of view depending on the reader's preferences. If commonly used learning algorithms fail to converge to the optimal contract in our simple framework, one can interpret this on one hand as posing serious concerns about their applicability, but on the other hand (if we believe that people use such algorithms to learn) this can also be interpreted as theory getting too far ahead of reality. Evolutionary learning algorithms such as genetic algorithms, classifier systems, genetic programming, evolutionary programming, etc. have been widely used in economic applications (see Arifovic, 2000 for a survey of applications in macroeconomics; LeBaron, 1999 for applications in finance; and Dawid, 1999 for a general overview). Many of these applications focus on models of social learning where a population of agents (each represented by a single strategy) evolves over time such that the entire population jointly implements a behavioral algorithm. In other applications (e.g. Arifovic, 1994; Marimon, McGrattan, and Sargent, 1989; Vriend, 2000) genetic algorithms are used in models of individual learning, where evolution takes place on a set of strategies belonging to an individual agent. We investigate the implications of both social and individual learning.
First, we study a social learning model where agents update their strategies by imitating the strategies of agents who performed better in the past and by occasionally experimenting with new strategies. Evidence for such behavior in learning about new technologies exists, for example, in the development literature (Udry, 1994). We also study individual evolutionary learning (Arifovic and Ledyard, 2003) where agents learn only from their own experience. In each time period the agent chooses probabilistically one of the strategies from her set and implements it. The foregone payoffs of all strategies are updated based on the observed outcomes. The strategy set is then updated by reinforcing the frequencies of strategies with relatively high payoffs and by adding new strategies. We also perform various robustness checks, varying the parameters of the learning algorithms, and study the speed of convergence. The results show that social learning converges to the optimal contract under full rationality while individual learning fails. The intuition for the failure of individual learning is that, when evaluating foregone payoffs of potential strategies that have not been tried, the principal assumes that the agent's action will remain constant (as if the two parties were playing Nash), while in reality the optimal contract involves an optimal response to the agent's best response function, as in a Stackelberg setting. The inability of individual learning to produce correct payoffs for the principal's strategies undermines its convergence to the optimal profit-maximizing contract. In contrast, social learning involves evaluating only strategy payoffs that have actually been implemented by some principals in the economy, thus circumventing the above problem. This failure of the model of individual learning where foregone payoffs are taken into account is in stark contrast to the findings reported in the existing literature.
Various studies (see, for example, Camerer and Ho, 1998; Camerer, 2003; Arifovic and Ledyard, 2004) find that the performance of these models, when evaluated against evidence from experiments with human subjects, is superior to the performance of the learning models where only actual strategy payoffs are taken into account.
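
The contrast between the two learning schemes can be illustrated with a toy simulation. This is a minimal sketch under assumed primitives (two output levels, square-root utility, quadratic effort cost) and is not the authors' algorithm: social learning is reduced to pairwise imitation plus Gaussian experimentation, and individual learning is replaced by a deliberately simplified stand-in that re-optimizes over foregone payoffs computed with effort held fixed. All names and parameter values are hypothetical.

```python
import math
import random

# Assumed illustrative parameters (not from the paper).
Y_LOW, Y_HIGH, EFFORT_COST = 1.0, 4.0, 2.0


def best_effort(share):
    """Agent's best response to an output share (assumed closed form)."""
    gain = math.sqrt(share * Y_HIGH) - math.sqrt(share * Y_LOW)
    return min(1.0, max(0.0, gain / EFFORT_COST))


def profit(share, effort):
    """Principal's expected profit for a given share and effort level."""
    return (1.0 - share) * (effort * Y_HIGH + (1.0 - effort) * Y_LOW)


def social_learning(pop_size=30, periods=200, mutation_rate=0.05, seed=0):
    """Imitation plus experimentation over a population of principals.

    Returns the population's average share after the final period.
    """
    rng = random.Random(seed)
    shares = [rng.random() for _ in range(pop_size)]
    for _ in range(periods):
        # Each contract is actually implemented, so payoffs reflect the
        # agent's true best response to that contract.
        payoffs = [profit(s, best_effort(s)) for s in shares]
        new_shares = []
        for i in range(pop_size):
            j = rng.randrange(pop_size)            # randomly met principal
            s = shares[j] if payoffs[j] > payoffs[i] else shares[i]
            if rng.random() < mutation_rate:       # occasional experiment
                s = min(1.0, max(0.0, s + rng.gauss(0.0, 0.02)))
            new_shares.append(s)
        shares = new_shares
    return sum(shares) / pop_size


def foregone_payoff_learning(start_share=0.5, periods=20, grid_size=101):
    """Simplified stand-in for learning from foregone payoffs.

    Untried shares are evaluated as if the agent's effort stayed at the
    level induced by the current contract (the Nash-style assumption).
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    share = start_share
    for _ in range(periods):
        effort = best_effort(share)                # effort under current contract
        share = max(grid, key=lambda s: profit(s, effort))
    return share
```

In this sketch `foregone_payoff_learning()` collapses to a share of zero: with effort held fixed, paying the agent less always looks more profitable, so the incentive effect of the contract is never priced in. `social_learning()` only ever compares payoffs of contracts that were actually implemented, which is the mechanism the abstract credits for its convergence.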

Keywords: learning; moral hazard; optimal contracting; genetic algorithms
JEL-codes: C6 D82 D83
Date: 2006

Related works:
Working Paper: Evolutionary Learning in Principal/Agent Models (2006)

Persistent link: https://EconPapers.repec.org/RePEc:red:sed006:505
