Deep Reinforcement Learning Algorithm with Long Short-Term Memory Network for Optimizing Unmanned Aerial Vehicle Information Transmission

He, Yufei; Hu, Ruiqi; Liang, Kewei; Liu, Yonghong; Zhou, Zhiyuan

Yufei He, Ruiqi Hu, Kewei Liang, Yonghong Liu and Zhiyuan Zhou
Additional contact information
Yufei He: Polytechnic Institute, Zhejiang University, Hangzhou 310015, China
Ruiqi Hu: Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong, China
Kewei Liang: School of Mathematical Sciences, Zhejiang University, Hangzhou 310058, China
Yonghong Liu: School of Mathematical Sciences, Zhejiang University, Hangzhou 310058, China
Zhiyuan Zhou: Applied Mathematics, Beijing Normal University—Hong Kong Baptist University United International College, Zhuhai 519087, China

Mathematics, 2024, vol. 13, issue 1, 1-18

Abstract: Optimizing information transmission for unmanned aerial vehicles (UAVs) is essential to their operational efficiency across applications. The task is formulated as a mixed-integer nonconvex optimization problem, which traditional optimization algorithms and reinforcement learning (RL) methods often struggle to solve effectively. In this paper, we propose a novel deep reinforcement learning algorithm that operates on a hybrid discrete–continuous action space. To handle the long-term dependencies inherent in UAV operations, we incorporate a long short-term memory (LSTM) network. Our approach accounts for the specific flight constraints of fixed-wing UAVs and employs a continuous policy network for real-time flight path planning. A non-sparse reward function is designed to maximize data collection from Internet of Things (IoT) devices, guiding the UAV toward efficient operation. Experimental results show that the proposed algorithm yields near-optimal flight paths and improves data collection by up to 10.76% compared with conventional heuristic methods. Simulations confirm the effectiveness and practicality of the proposed approach in real-world scenarios.
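
To make the idea of an LSTM-backed policy over a hybrid discrete–continuous action space more concrete, here is a minimal, untrained sketch in plain Python. It is not the authors' implementation. The abstract does not say what the discrete and continuous actions are, so the sketch assumes the discrete part selects which IoT device to serve and the continuous part is a bounded turn angle standing in for a fixed-wing flight constraint. All class names, parameters, and dimensions (`LSTMCell`, `HybridPolicy`, `max_turn`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """A single LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, n_in, n_hid, rng):
        # One weight matrix per gate, applied to the concatenation [x, h].
        self.W = {k: rand_matrix(n_hid, n_in + n_hid, rng) for k in "ifog"}
        self.b = {k: [0.0] * n_hid for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # concatenate input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]       # input gate
        f = [sigmoid(v) for v in pre["f"]]       # forget gate
        o = [sigmoid(v) for v in pre["o"]]       # output gate
        cand = [math.tanh(v) for v in pre["g"]]  # candidate cell state
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


class HybridPolicy:
    """LSTM state feeding a discrete head and a continuous head (assumed layout)."""

    def __init__(self, n_obs, n_hid, n_devices, max_turn, rng):
        self.cell = LSTMCell(n_obs, n_hid, rng)
        self.W_disc = rand_matrix(n_devices, n_hid, rng)  # discrete head: device choice
        self.W_cont = rand_matrix(1, n_hid, rng)          # continuous head: turn angle
        self.max_turn = max_turn                          # assumed turn limit (radians)
        self.rng = rng
        self.h = [0.0] * n_hid
        self.c = [0.0] * n_hid

    def act(self, obs):
        # The recurrent state carries information from earlier time steps.
        self.h, self.c = self.cell.step(obs, self.h, self.c)
        # Discrete action: sample a device index from a softmax over logits.
        logits = matvec(self.W_disc, self.h)
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        device = self.rng.choices(range(len(probs)), weights=probs)[0]
        # Continuous action: Gaussian exploration noise, squashed into the turn limit.
        mean = matvec(self.W_cont, self.h)[0]
        turn = self.max_turn * math.tanh(mean + self.rng.gauss(0.0, 0.1))
        return device, turn


rng = random.Random(0)
policy = HybridPolicy(n_obs=4, n_hid=8, n_devices=3,
                      max_turn=math.radians(30), rng=rng)
# Five time steps with a made-up 4-dimensional observation.
trajectory = [policy.act([0.1 * t, 0.0, 1.0, -0.5]) for t in range(5)]
```

Each call to `act` returns a pair `(device, turn)`: one discrete choice and one continuous value sharing the same recurrent state. Training (for example, by a policy-gradient method with a dense reward) is deliberately left out, since the abstract does not specify it.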

Keywords: unmanned aerial vehicle (UAV); deep reinforcement learning (DRL); long short-term memory (LSTM); optimal control; nonconvex optimization
JEL-codes: C
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/1/46/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/1/46/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2024:i:1:p:46-:d:1554068

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.