Pruning Stochastic Game Trees Using Neural Networks for Reduced Action Space Approximation

Papagiannis, Tasos; Alexandridis, Georgios; Stafylopatis, Andreas

Tasos Papagiannis, Georgios Alexandridis and Andreas Stafylopatis
Additional contact information
Tasos Papagiannis: Zografou Campus, School of Electrical & Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
Georgios Alexandridis: Zografou Campus, School of Electrical & Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
Andreas Stafylopatis: Zografou Campus, School of Electrical & Computer Engineering, National Technical University of Athens, 15780 Athens, Greece

Mathematics, 2022, vol. 10, issue 9, 1-16

Abstract: Monte Carlo Tree Search has proved very efficient in the broad domain of Game AI, though it suffers from high dimensionality when the branching factor is large. Several pruning techniques have been proposed to tackle this problem, most of which require explicit domain knowledge. This study proposes an approach that uses neural networks to determine how many actions to prune, given the number of iterations run and the total number of possible actions. Multi-armed bandit simulations with the UCB1 formula are used to generate suitable datasets for training the networks, and a specifically designed process is followed to select the best combination of iteration count and number of actions to prune. Two pruning Monte Carlo Tree Search variants, based on different distributions of the actions' expected rewards, are investigated and evaluated in the collectible card game Hearthstone. The proposed technique improves the performance of the Monte Carlo Tree Search algorithm under different computational limits on the available number of tree search iterations, and the improvement is significantly larger when the technique is combined with state-value prediction models trained through supervised learning.
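
The abstract's data-generation step can be illustrated with a small sketch. The Python below is not the authors' implementation: it only shows a standard UCB1 multi-armed bandit simulation and one possible way to measure how many low-ranked actions could be pruned without discarding the best one. The Bernoulli reward model, the ranking by empirical mean reward, and the names `ucb1_simulation` and `safe_prune_count` are all assumptions made for illustration.

```python
import math
import random


def ucb1_simulation(true_means, iterations, c=math.sqrt(2), seed=0):
    """Run UCB1 on Bernoulli arms (assumed reward model).

    Returns (pull counts, empirical mean rewards) per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    totals = [0.0] * k
    for t in range(1, iterations + 1):
        if t <= k:
            arm = t - 1  # pull every arm once before applying the formula
        else:
            # UCB1 score: empirical mean + c * sqrt(ln t / n_a);
            # with c = sqrt(2) this is the usual sqrt(2 ln t / n_a) bonus
            arm = max(
                range(k),
                key=lambda a: totals[a] / counts[a]
                + c * math.sqrt(math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    means = [totals[a] / counts[a] if counts[a] else 0.0 for a in range(k)]
    return counts, means


def safe_prune_count(true_means, means):
    """Hypothetical label: number of worst-ranked arms that can be dropped
    while still keeping the truly best arm (ranking by empirical mean)."""
    best = max(range(len(true_means)), key=lambda a: true_means[a])
    ranking = sorted(range(len(means)), key=lambda a: means[a])  # worst first
    return ranking.index(best)


if __name__ == "__main__":
    arm_means = [0.2, 0.35, 0.5, 0.65, 0.8]
    counts, means = ucb1_simulation(arm_means, iterations=500)
    print(counts, safe_prune_count(arm_means, means))
```

Repeating such simulations over many combinations of iteration count and number of arms would give (iterations, actions, prunable count) samples of the kind a network could be trained on. The exact labelling scheme used in the paper is not described in this record.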

Keywords: Monte Carlo Tree Search; pruning; neural networks; multi-armed bandit; Upper Confidence Bound; Hearthstone
JEL-codes: C
Date: 2022

Downloads:
https://www.mdpi.com/2227-7390/10/9/1509/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/9/1509/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:9:p:1509-:d:807146

Mathematics is currently edited by Ms. Emma He