EconPapers    

Deep reinforcement learning-based multi-objective control of hybrid power system combined with road recognition under time-varying environment

Jiaxin Chen, Hong Shu, Xiaolin Tang, Teng Liu and Weida Wang

Energy, 2022, vol. 239, issue PC

Abstract: Aiming to promote the intelligent development of control technology for new energy vehicles and to demonstrate the advantages of deep reinforcement learning (DRL), this paper first trains a VGG16-based road recognition convolutional neural network (CNN). A large set of high-definition images of five typical road surfaces (dry asphalt, wet asphalt, snow, dry cobblestone, and wet cobblestone) is collected from the racing game Dust Rally 2.0. A time-varying driving environment model is then established, involving driving images, road slope, longitudinal speed, and the number of passengers. Finally, a stereoscopic control network suited to a nine-dimensional state space and a three-dimensional action space is built, and a deep Q-network (DQN)-based energy management strategy (EMS) achieving multi-objective control is proposed for parallel hybrid electric vehicles (HEVs) with the P3 structure; it comprises a fine-tuning strategy for the motor speed to maintain the optimal slip rate during braking, an engine power control strategy, and a continuously variable transmission (CVT) gear ratio control strategy. Simulation results show that, under the influence of factors such as tree shade and image compression, the road recognition network achieves its highest accuracy on snow and wet asphalt roads. The three control strategies learned simultaneously by the stereoscopic control network not only maintain a near-optimal slip rate during braking but also achieve a fuel consumption of 4788.93 g, compared with 4295.61 g for the dynamic programming (DP)-based EMS. Moreover, even though the DP-based EMS contains only three states and two actions, the DP-based and DQN-based EMSs take about 4911 s and 10 s, respectively, to run the 3602 s speed cycle. Therefore, both the optimization and the real-time performance of the DRL-based EMS can be guaranteed.
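The abstract's "stereoscopic control network" maps a nine-dimensional state to Q-values over a discretised three-dimensional action space (motor-speed fine-tuning, engine power, CVT gear ratio). The paper does not disclose layer sizes or discretisation levels, so the sketch below is a hypothetical minimal forward pass in plain numpy; all dimensions other than the 9-dimensional state and 3-dimensional action are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 9            # state dimension stated in the abstract
LEVELS = (5, 5, 5)       # assumed discretisation per action dimension
N_ACTIONS = int(np.prod(LEVELS))  # joint actions: 5 * 5 * 5 = 125

# Two-layer MLP Q-network with random (untrained) weights; hidden
# width of 64 is an assumption, not taken from the paper.
W1 = rng.normal(0.0, 0.1, (STATE_DIM, 64))
b1 = np.zeros(64)
W2 = rng.normal(0.0, 0.1, (64, N_ACTIONS))
b2 = np.zeros(N_ACTIONS)

def q_values(state):
    """Return one Q-value per joint (motor, engine, CVT) action."""
    h = np.maximum(0.0, state @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2

def greedy_action(state):
    """Argmax over joint Q-values, unravelled into the three commands."""
    idx = int(np.argmax(q_values(state)))
    return np.unravel_index(idx, LEVELS)  # (motor, engine, cvt) indices

state = rng.normal(size=STATE_DIM)
action = greedy_action(state)
```

Flattening the three control dimensions into one discrete joint-action index is one common way to fit a multi-dimensional action space into a standard DQN output head; whether the authors did exactly this is not specified in the abstract.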

Keywords: Hybrid electric vehicle; Road recognition network; Deep reinforcement learning; Multi-objective control network; Energy management strategy (search for similar items in EconPapers)
Date: 2022
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (4)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0360544221023719
Full text for ScienceDirect subscribers only

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:energy:v:239:y:2022:i:pc:s0360544221023719

DOI: 10.1016/j.energy.2021.122123

Access Statistics for this article

Energy is currently edited by Henrik Lund and Mark J. Kaiser

More articles in Energy from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:energy:v:239:y:2022:i:pc:s0360544221023719