OPTIMAL PAYOFF FUNCTIONS FOR MEMBERS OF COLLECTIVES
David H. Wolpert and Kagan Tumer
Additional contact information
David H. Wolpert: NASA Ames Research Center, Moffett Field, CA 94035, USA
Kagan Tumer: NASA Ames Research Center, Moffett Field, CA 94035, USA
Advances in Complex Systems (ACS), 2001, vol. 04, issue 02n03, 265-279
Abstract:
We consider the problem of designing (perhaps massively distributed) collectives of computational processes to maximize a provided "world utility" function. We consider this problem when the behavior of each process in the collective can be cast as striving to maximize its own payoff utility function. For such cases the central design issue is how to initialize/update those payoff utility functions of the individual processes so as to induce behavior of the entire collective having good values of the world utility. Traditional "team game" approaches to this problem simply assign to each process the world utility as its payoff utility function. In previous work we used the "Collective Intelligence" (COIN) framework to derive a better choice of payoff utility functions, one that results in world utility performance up to orders of magnitude superior to that ensuing from the use of the team game utility. In this paper, we extend these results using a novel mathematical framework. Under that new framework we review the derivation of the general class of payoff utility functions that both (i) are easy for the individual processes to try to maximize, and (ii) have the property that if good values of them are achieved, then we are assured a high value of world utility. These are the "Aristocrat Utility" and a new variant of the "Wonderful Life Utility" that was introduced in the previous COIN work. We demonstrate experimentally that using these new utility functions can result in significantly improved performance over that of previously investigated COIN payoff utilities, over and above those previous utilities' superiority to the conventional team game utility. These results also illustrate the substantial superiority of these payoff functions to perhaps the most natural version of the economics technique of "endogenizing externalities."
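As a rough illustration of the payoff utilities named in the abstract, the sketch below contrasts a team game payoff with difference-style payoffs on a toy bar-attendance problem. Everything in it is an assumption made for illustration and is not taken from this record: the world utility form (attendance times a decaying exponential, summed over nights), the choice to clamp an agent to "absent" in the Wonderful Life Utility, the uniform averaging over an agent's own actions in the Aristocrat Utility, and all function and parameter names.

```python
import math
from collections import Counter


def world_utility(actions, num_nights, capacity):
    """Assumed world utility G: sum over nights of x * exp(-x / capacity),
    where x is that night's attendance. An action of None means 'absent'."""
    attendance = Counter(a for a in actions if a is not None)
    return sum(
        attendance[night] * math.exp(-attendance[night] / capacity)
        for night in range(num_nights)
    )


def team_game_utility(actions, i, num_nights, capacity):
    """Team game: every agent is simply handed the world utility."""
    return world_utility(actions, num_nights, capacity)


def wonderful_life_utility(actions, i, num_nights, capacity, clamp=None):
    """Difference between G and G with agent i's action clamped to a fixed
    value (here assumed to default to 'absent')."""
    clamped = list(actions)
    clamped[i] = clamp
    return (world_utility(actions, num_nights, capacity)
            - world_utility(clamped, num_nights, capacity))


def aristocrat_utility(actions, i, num_nights, capacity):
    """Difference between G and the average of G over agent i's possible
    actions. A uniform distribution over nights is assumed here."""
    total = 0.0
    for night in range(num_nights):
        alternative = list(actions)
        alternative[i] = night
        total += world_utility(alternative, num_nights, capacity)
    return world_utility(actions, num_nights, capacity) - total / num_nights
```

With `actions = [0, 0, 0, 1]`, two nights, and `capacity = 1.0`, all four agents receive the same team game payoff, whereas under this sketch an agent on the crowded night 0 gets a negative Wonderful Life Utility and the lone agent on night 1 gets a positive one. That per-agent signal is the kind of property the abstract describes as making a payoff easy for an individual process to try to maximize.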
Keywords: Distributed control; distributive learning; reinforcement learning; collective intelligence; El Farol Bar problem; clamping parameter; wonderful life utility; aristocrat utility; team game
Date: 2001
Downloads:
http://www.worldscientific.com/doi/abs/10.1142/S0219525901000188
Access to full text is restricted to subscribers
Persistent link: https://EconPapers.repec.org/RePEc:wsi:acsxxx:v:04:y:2001:i:02n03:n:s0219525901000188
DOI: 10.1142/S0219525901000188
Advances in Complex Systems (ACS) is currently edited by Frank Schweitzer