
Research on Intelligent Control Method of Launch Vehicle Landing Based on Deep Reinforcement Learning

Shuai Xue, Hongyang Bai (), Daxiang Zhao and Junyan Zhou
Additional contact information
Shuai Xue: School of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Hongyang Bai: School of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Daxiang Zhao: School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China
Junyan Zhou: School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China

Mathematics, 2023, vol. 11, issue 20, 1-17

Abstract: A launch vehicle must adapt to a complex flight environment, and traditional guidance and control algorithms, which depend heavily on accurate control models, struggle to handle multi-factor uncertainties. To address this problem, this paper designs a new intelligent flight control method for a rocket based on a deep reinforcement learning algorithm driven by both knowledge and data. A Markov decision process for the rocket landing phase is established by designing a reward function that combines the return from the launch vehicle's terminal constraints with the cumulative return over the rocket's flight process. To speed up training of the landing process and improve the model's generalization ability, the policy neural network is built as a long short-term memory (LSTM) network combined with a fully connected layer and trained as the landing guidance strategy network. Proximal policy optimization (PPO) is used as the training algorithm for the reinforcement learning network parameters, with behavioral cloning (BC) serving as the imitation learning algorithm for pre-training. Notably, the rocket-borne environment is ported to the Nvidia Jetson TX2 embedded platform for comparative testing and verification of the intelligent model, which is then used to generate real-time control commands guiding the rocket's actual flight and landing. Comparisons between the results of convex landing optimization and the proposed method demonstrate its effectiveness. The simulation results show that the intelligent control method meets the landing accuracy requirements of the launch vehicle, converging quickly in 84 steps with a decision time of only 2.5 ms, and is capable of online autonomous decision making when deployed on the embedded platform.
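The abstract describes a reward design that combines a per-step (process) return with a terminal-constraint return, optimized with PPO's clipped surrogate objective. The sketch below is a minimal illustration of those two ideas, not the paper's actual implementation: the function names, penalty weights, and landing tolerances are hypothetical placeholders, and the clipped-objective form follows the standard PPO formulation.

```python
import numpy as np

def step_reward(fuel_used, attitude_err, w_fuel=0.01, w_att=0.1):
    # Hypothetical per-step (process) reward: small penalties on fuel
    # consumption and attitude error accumulate over the flight.
    return -(w_fuel * fuel_used + w_att * attitude_err)

def terminal_reward(pos_err, vel_err, pos_tol=5.0, vel_tol=2.0, bonus=100.0):
    # Hypothetical terminal-constraint reward: a bonus when landing
    # position/velocity errors are within tolerance, a scaled penalty otherwise.
    if pos_err <= pos_tol and vel_err <= vel_tol:
        return bonus
    return -(pos_err + vel_err)

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    # Standard PPO clipped surrogate objective (returned as a loss to minimize):
    # the probability ratio is clipped to [1-eps, 1+eps] to limit policy updates.
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))
```

In such a scheme, the episode return seen by the learner is the sum of the step rewards plus the terminal reward, so the terminal landing constraints and the cumulative flight cost are traded off through the chosen weights.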

Keywords: launch vehicle; landing phase; deep reinforcement learning; LSTM; imitation learning; embedded platform (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2023
References: View references in EconPapers View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/20/4276/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/20/4276/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:20:p:4276-:d:1259205

Access Statistics for this article

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:11:y:2023:i:20:p:4276-:d:1259205