Research on Autonomous Manoeuvre Decision Making in Within-Visual-Range Aerial Two-Player Zero-Sum Games Based on Deep Reinforcement Learning

Lu, Bo; Ru, Le; Hu, Shiguang; Wang, Wenfei; Xi, Hailong; Zhao, Xiaolin

Bo Lu, Le Ru, Shiguang Hu, Wenfei Wang, Hailong Xi and Xiaolin Zhao
Additional contact information
Bo Lu: Equipment Management and UAV Engineering College, Air Force Engineering University, Xi’an 710051, China
Le Ru: Equipment Management and UAV Engineering College, Air Force Engineering University, Xi’an 710051, China
Shiguang Hu: Equipment Management and UAV Engineering College, Air Force Engineering University, Xi’an 710051, China
Wenfei Wang: Equipment Management and UAV Engineering College, Air Force Engineering University, Xi’an 710051, China
Hailong Xi: Equipment Management and UAV Engineering College, Air Force Engineering University, Xi’an 710051, China
Xiaolin Zhao: Equipment Management and UAV Engineering College, Air Force Engineering University, Xi’an 710051, China

Mathematics, 2024, vol. 12, issue 14, 1-16

Abstract: As technology moves rapidly towards automation and intelligence, autonomous decision-making in unmanned systems is poised to play a crucial role in contemporary aerial two-player zero-sum games (TZSGs). Deep reinforcement learning (DRL) methods enable agents to make autonomous manoeuvring decisions. This paper examines current mainstream DRL algorithms built on fundamental tactical manoeuvres and selects a typical aerial TZSG scenario: within-visual-range (WVR) combat. We model the key elements influencing the game as a Markov decision process (MDP) and present the mathematical foundation for implementing DRL. Using high-fidelity simulation software (Warsim v1.0), we design a prototypical close-range aerial combat scenario, train mainstream DRL algorithms in this environment, and analyse the training outcomes. We summarise how effectively these algorithms enable agents to manoeuvre autonomously in aerial TZSGs, providing a basis for further research.

Keywords: WVR; TZSG; deep reinforcement learning; Markov decision processes; decision making
JEL-codes: C
Date: 2024

Downloads:
https://www.mdpi.com/2227-7390/12/14/2160/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/14/2160/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:14:p:2160-:d:1432234

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.