Multiple UAVs Path Planning Based on Deep Reinforcement Learning in Communication Denial Environment

Yahao Xu, Yiran Wei, Keyang Jiang, Di Wang and Hongbin Deng
Additional contact information
Yahao Xu: School of Mechatronical Engineering, Beijing Institute of Technology, 5th South Zhongguancun Street, Beijing 100081, China
Yiran Wei: School of Mechatronical Engineering, Beijing Institute of Technology, 5th South Zhongguancun Street, Beijing 100081, China
Keyang Jiang: School of Mechatronical Engineering, Beijing Institute of Technology, 5th South Zhongguancun Street, Beijing 100081, China
Di Wang: School of Mechatronical Engineering, Beijing Institute of Technology, 5th South Zhongguancun Street, Beijing 100081, China
Hongbin Deng: School of Mechatronical Engineering, Beijing Institute of Technology, 5th South Zhongguancun Street, Beijing 100081, China

Mathematics, 2023, vol. 11, issue 2, 1-15

Abstract: We propose C51-Duel-IP (C51 Dueling DQN with Independent Policy), a dynamic-destination path-planning algorithm for autonomous navigation and avoidance by multiple Unmanned Aerial Vehicles (UAVs) in a communication denial environment. The algorithm represents the Q function output by the Dueling network as a distribution over Q values, which improves how well the Q value can be fitted. We also extend single-step temporal-difference (TD) learning to N-step TD, which addresses the inflexible updates of the single-step method. More importantly, each UAV follows an independent policy, so multiple UAVs navigate and avoid autonomously without any communication with each other. Under communication denial, the independent policy keeps the UAVs consistent with one another and avoids greedy behavior by individual UAVs. We consider two multiple-UAV dynamic-destination scenarios: path planning with UAVs taking off from different initial positions, and dynamic path planning with UAVs taking off from the same initial position. Hardware-in-the-loop (HITL) experiments in an urban simulation environment show that C51-Duel-IP is much more robust and effective than the original Dueling-IP and DQN-IP algorithms. The independent policy performs similarly to a shared policy, with the significant advantage that it can run in a communication denial environment.
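
The two update ingredients named in the abstract, an N-step TD target and a Q value read off a categorical Q distribution, can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions of both quantities and is not the authors' implementation; the function names, the discount factor `gamma`, and the atom grid are hypothetical choices made for illustration only.

```python
from typing import Sequence


def n_step_return(rewards: Sequence[float], bootstrap_value: float, gamma: float) -> float:
    """N-step TD target: discounted sum of the n observed rewards plus the
    discounted value estimate of the state reached after n steps.
    With a single reward this reduces to the usual one-step target."""
    target = bootstrap_value
    # Fold from the last reward backwards: G = r_t + gamma * G_next
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


def expected_q(probs: Sequence[float], atoms: Sequence[float]) -> float:
    """Scalar Q value of one action under a categorical (C51-style)
    distribution: the probability-weighted mean of the support atoms."""
    if len(probs) != len(atoms):
        raise ValueError("probs and atoms must have the same length")
    return sum(p * z for p, z in zip(probs, atoms))


if __name__ == "__main__":
    # Three observed rewards, then bootstrap from an assumed value estimate.
    print(n_step_return([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5))
    # A toy three-atom distribution (C51 itself uses 51 atoms).
    print(expected_q([0.25, 0.5, 0.25], [-1.0, 0.0, 1.0]))
```

Choosing a larger N lets rewards observed several steps ahead reach the target directly, while the bootstrap term still supplies the estimate for everything beyond the N-th step.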

Keywords: multi-agent reinforcement learning; UAV path planning; visual perception; communication denial
JEL-codes: C
Date: 2023
Citations: 1 (in EconPapers)

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/2/405/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/2/405/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:2:p:405-:d:1034231

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:11:y:2023:i:2:p:405-:d:1034231