Hybrid Deep Reinforcement Learning Considering Discrete-Continuous Action Spaces for Real-Time Energy Management in More Electric Aircraft
Bing Liu,
Bowen Xu,
Tong He,
Wei Yu and
Fanghong Guo
Additional contact information
Bing Liu: College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Bowen Xu: College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
Tong He: College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Wei Yu: Green Rooftop Inc., Hangzhou 310032, China
Fanghong Guo: College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Energies, 2022, vol. 15, issue 17, 1-21
Abstract:
The growing number and functional complexity of power electronic devices in more electric aircraft (MEA) power systems make modelling and computation highly complex, so real-time energy management is a formidable challenge. Moreover, the discrete-continuous action space of the MEA system under consideration poses a difficulty for existing deep reinforcement learning (DRL) algorithms. This paper therefore proposes a real-time energy management optimisation strategy based on hybrid deep reinforcement learning (HDRL). An energy management model of the MEA power system is built from the characteristics of the generators, buses, loads and energy storage system (ESS), and the task is formulated as a multi-objective optimisation problem with both integer and continuous variables. The problem is solved by combining a duelling double deep Q network (D3QN) algorithm, which handles the discrete action space, with a deep deterministic policy gradient (DDPG) algorithm, which handles the continuous action space. The two algorithms are trained alternately and interact with each other to maximise the long-term payoff of the MEA. Simulation results verify the effectiveness of the method under different generator operating conditions. For different time horizons T, the method always attains smaller objective function values than previous DRL algorithms and, despite a slight loss of solution accuracy, runs several orders of magnitude faster than commercial solvers, with a solution time that never exceeds 0.2 s. The method is also validated on a hardware-in-the-loop simulation platform.
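To make the hybrid action space concrete, below is a minimal, hypothetical PyTorch sketch of the joint action-selection idea the abstract describes: a duelling double DQN (D3QN) head picks the discrete action (e.g. a switch or bus configuration), while a DDPG actor outputs the continuous action (e.g. an ESS power set-point). The state/action dimensions, network sizes, exploration scheme and toy state are placeholder assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of hybrid (discrete + continuous) action selection,
# in the spirit of D3QN + DDPG; dimensions and noise levels are assumed.
import torch
import torch.nn as nn

STATE_DIM, N_DISCRETE, CONT_DIM = 8, 4, 2  # placeholder sizes

class DuellingQNet(nn.Module):
    """Duelling head: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU())
        self.value = nn.Linear(64, 1)          # state value V(s)
        self.adv = nn.Linear(64, N_DISCRETE)   # advantages A(s, a)

    def forward(self, s):
        h = self.body(s)
        a = self.adv(h)
        return self.value(h) + a - a.mean(dim=-1, keepdim=True)

class Actor(nn.Module):
    """DDPG actor: deterministic continuous action in [-1, 1]^CONT_DIM."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, CONT_DIM), nn.Tanh())

    def forward(self, s):
        return self.net(s)

qnet, actor = DuellingQNet(), Actor()

def hybrid_action(state, eps=0.1):
    """Select (discrete, continuous) jointly: epsilon-greedy over the Q-values
    for the discrete part, actor output plus Gaussian exploration noise
    (clamped to the valid range) for the continuous part."""
    with torch.no_grad():
        if torch.rand(1).item() < eps:
            k = torch.randint(N_DISCRETE, (1,)).item()
        else:
            k = qnet(state).argmax().item()
        u = (actor(state) + 0.1 * torch.randn(CONT_DIM)).clamp(-1.0, 1.0)
    return k, u

s = torch.randn(STATE_DIM)  # toy state stand-in for MEA measurements
print(hybrid_action(s))
```

In the full method as described, the two learners would be trained alternately on transitions from the shared environment, each treating the other's current policy as fixed while updating its own; target networks and replay buffers (standard in both D3QN and DDPG) are omitted here for brevity.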
Keywords: more electric aircraft; real-time energy management; discrete-continuous hybrid action space; hybrid deep reinforcement learning (HDRL)
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49
Date: 2022
Downloads:
https://www.mdpi.com/1996-1073/15/17/6323/pdf (application/pdf)
https://www.mdpi.com/1996-1073/15/17/6323/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:15:y:2022:i:17:p:6323-:d:901670