Can Monte-Carlo Tree Search learn to sacrifice?

Companez, Nathan; Aleti, Aldeida

Can Monte-Carlo Tree Search learn to sacrifice?

Nathan Companez () and Aldeida Aleti ()
Additional contact information
Nathan Companez: Monash University
Aldeida Aleti: Monash University

Journal of Heuristics, 2016, vol. 22, issue 6, No 1, 783-813

Abstract: Abstract One of the most basic activities performed by an intelligent agent is deciding what to do next. The decision is usually about selecting the move with the highest expectation, or exploring new scenarios. Monte-Carlo Tree Search (MCTS), which was developed as a game playing agent, deals with this exploration–exploitation ‘dilemma’ using a multi-armed bandits strategy. The success of MCTS in a wide range of problems, such as combinatorial optimisation, reinforcement learning, and games, is due to its ability to rapidly evaluate problem states without requiring domain-specific knowledge. However, it has been acknowledged that the trade-off between exploration and exploitation is crucial for the performance of the algorithm, and affects the efficiency of the agent in learning deceptive states. One type of deception is states that give immediate rewards, but lead to a suboptimal solution in the long run. These states are known as trap states, and have been thoroughly investigated in previous research. In this work, we study the opposite of trap states, known as sacrifice states, which are deceptive moves that result in a local loss but are globally optimal, and investigate the efficiency of MCTS enhancements in identifying this type of moves.

Keywords: Monte-Carlo Tree Search; Sacrifice moves; Artificial intelligence; Games (search for similar items in EconPapers)
Date: 2016
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
http://link.springer.com/10.1007/s10732-016-9320-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:spr:joheur:v:22:y:2016:i:6:d:10.1007_s10732-016-9320-y

Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10732

DOI: 10.1007/s10732-016-9320-y

Access Statistics for this article

Journal of Heuristics is currently edited by Manuel Laguna

More articles in Journal of Heuristics from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().