Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess

Xiali Li, Zhengyu Lv, Licheng Wu, Yue Zhao and Xiaona Xu

Complexity, 2020, vol. 2020, 1-11

Abstract:

In this study, hybrid state-action-reward-state-action (SARSA) and Q-learning algorithms are applied to different stages of an upper confidence bounds applied to trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning strategy is proposed that uses the SARSA and Q-learning algorithms and combines domain knowledge into a feedback function for the layout and battle stages. An improved deep neural network based on ResNet18 is used for self-play training. Experimental results show that hybrid online and offline reinforcement learning with a deep neural network can improve the game program’s learning efficiency and its understanding of Tibetan Jiu chess.
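The abstract couples an on-policy SARSA update, applied online move by move during the layout and battle stages, with an off-policy Q-learning backup of every node on the search path once a game ends. The sketch below illustrates that pairing only; it is not the authors' implementation. The state/action representation, the hyperparameters (ALPHA, GAMMA, EPSILON), the legal_actions helper, and the zero-intermediate-reward assumption are all hypothetical.

    import random
    from collections import defaultdict

    ALPHA = 0.1    # learning rate (assumed value)
    GAMMA = 0.9    # discount factor (assumed value)
    EPSILON = 0.1  # exploration rate for the behaviour policy (assumed value)

    Q = defaultdict(float)  # Q[(state, action)] -> estimated action value


    def epsilon_greedy(state, legal_actions):
        """Behaviour policy used online: mostly greedy, occasionally random."""
        actions = legal_actions(state)
        if random.random() < EPSILON:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])


    def sarsa_update(s, a, r, s_next, a_next):
        """On-policy SARSA update, applied online during the layout/battle stages."""
        target = r + GAMMA * Q[(s_next, a_next)]
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])


    def q_learning_backup(search_path, final_reward, legal_actions):
        """Off-policy Q-learning backup of every node on the search path at game end."""
        # Walk the (state, action) path from the terminal position back to the root.
        for i in range(len(search_path) - 1, -1, -1):
            s, a = search_path[i]
            if i == len(search_path) - 1:
                target = final_reward                  # terminal node sees the game outcome
            else:
                s_next = search_path[i + 1][0]
                best_next = max(Q[(s_next, b)] for b in legal_actions(s_next))
                target = GAMMA * best_next             # intermediate rewards assumed zero
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])


    if __name__ == "__main__":
        # Toy usage on a one-dimensional "board" just to exercise both updates.
        legal = lambda s: [-1, +1]                     # two moves from every state
        path = []
        s, a = 0, epsilon_greedy(0, legal)
        for _ in range(5):
            s_next = s + a
            a_next = epsilon_greedy(s_next, legal)
            sarsa_update(s, a, 0.0, s_next, a_next)    # online, on-policy step
            path.append((s, a))
            s, a = s_next, a_next
        q_learning_backup(path, final_reward=1.0, legal_actions=legal)  # end-of-game backup

In a full UCT search these tabular values would instead be held in the tree nodes and combined with the ResNet18 value/policy estimates mentioned in the abstract; the tabular form here is only to keep the update rules visible.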

Date: 2020

Downloads: (external link)
http://downloads.hindawi.com/journals/8503/2020/4708075.pdf (application/pdf)
http://downloads.hindawi.com/journals/8503/2020/4708075.xml (text/xml)



Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:4708075

DOI: 10.1155/2020/4708075


More articles in Complexity from Hindawi.
