Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess
Xiali Li,
Zhengyu Lv,
Licheng Wu,
Yue Zhao and
Xiaona Xu
Complexity, 2020, vol. 2020, 1-11
Abstract:
In this study, hybrid state-action-reward-state-action (SARSA) and Q-learning algorithms are applied to different stages of an upper confidence bound applied to trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning strategy is proposed that combines the SARSA and Q-learning algorithms with domain knowledge to define feedback functions for the layout and battle stages. An improved deep neural network based on ResNet18 is used for self-play training. Experimental results show that hybrid online and offline reinforcement learning with a deep neural network can improve the game program’s learning efficiency and its understanding of Tibetan Jiu chess.
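As a rough illustration of the hybrid update described in the abstract, the sketch below shows a tabular SARSA (online, on-policy) update, a Q-learning (offline, off-policy) update, and a Q-learning backup over the whole search path at game end. The state/action encoding, hyperparameters (ALPHA, GAMMA, EPSILON), and helper names are assumptions for illustration, not the authors' network-based implementation.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1   # assumed hyperparameters
Q = defaultdict(float)                    # Q[(state, action)] -> value (tabular stand-in)

def epsilon_greedy(state, actions):
    # Behavior policy used while playing/searching.
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def sarsa_update(s, a, r, s_next, a_next):
    # On-policy (online) update: bootstrap on the action actually taken next.
    Q[(s, a)] += ALPHA * (r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)])

def q_learning_update(s, a, r, s_next, next_actions):
    # Off-policy update: bootstrap on the greedy value of the next state.
    best = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] += ALPHA * (r + GAMMA * best - Q[(s, a)])

def backup_search_path(path, final_reward):
    # At the end of a game, back up every node on the search path with the
    # Q-learning rule, as the abstract describes.
    r = final_reward
    for s, a, s_next, next_actions in reversed(path):
        q_learning_update(s, a, r, s_next, next_actions)
        r = 0.0  # in this sketch, only the terminal outcome carries reward
```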
Date: 2020
Downloads: (external link)
http://downloads.hindawi.com/journals/8503/2020/4708075.pdf (application/pdf)
http://downloads.hindawi.com/journals/8503/2020/4708075.xml (text/xml)
Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:4708075
DOI: 10.1155/2020/4708075