Reinforcement Learning–Based Energy Management Strategy for a Hybrid Electric Tracked Vehicle

Teng Liu, Yuan Zou, Dexing Liu and Fengchun Sun
Additional contact information
Teng Liu: Collaborative Innovation Center of Electric Vehicles in Beijing, School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
Yuan Zou: Collaborative Innovation Center of Electric Vehicles in Beijing, School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
Dexing Liu: Collaborative Innovation Center of Electric Vehicles in Beijing, School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
Fengchun Sun: Collaborative Innovation Center of Electric Vehicles in Beijing, School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China

Energies, 2015, vol. 8, issue 7, 1-18

Abstract: This paper presents a reinforcement learning (RL)–based energy management strategy for a hybrid electric tracked vehicle. A control-oriented model of the powertrain and vehicle dynamics is first established. Based on sampled data from an experimental driving schedule, the statistical characteristics of the power request at different velocities are captured by extracting its transition probability matrix. Two RL algorithms, Q-learning and Dyna, are then applied to derive optimal control solutions. Both algorithms are simulated on the same driving schedule, and the results are compared to clarify their respective merits and drawbacks. Although the Q-learning algorithm converges faster (3 h versus 7 h for the Dyna algorithm), its fuel consumption is 1.7% higher than that of the Dyna algorithm. Moreover, the Dyna algorithm achieves approximately the same fuel consumption as the dynamic programming–based global optimal solution, while its computational cost is substantially lower than that of stochastic dynamic programming.
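To make the two ingredients named in the abstract concrete, the following minimal Python/NumPy sketch (not taken from the paper) shows how a power-request transition probability matrix can be estimated from sampled driving-schedule data, and how a tabular Q-learning loop can be extended with Dyna-style planning updates via a planning_steps parameter. The state discretization, the reward_fn placeholder, the number of actions, and all parameter values are illustrative assumptions, not the authors' settings.

import numpy as np

def transition_matrix(power_samples, n_levels):
    # Discretize sampled power requests into n_levels bins, count
    # transitions between consecutive samples, and normalize each row
    # into a probability distribution.
    edges = np.linspace(np.min(power_samples), np.max(power_samples),
                        n_levels + 1)[1:-1]
    levels = np.digitize(power_samples, edges)  # indices 0 .. n_levels-1
    counts = np.zeros((n_levels, n_levels))
    for s, s_next in zip(levels[:-1], levels[1:]):
        counts[s, s_next] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)

def q_learning(P, n_actions, reward_fn, episodes=500, steps=200,
               alpha=0.1, gamma=0.95, epsilon=0.1, planning_steps=0,
               seed=0):
    # Tabular Q-learning over the power-request states described by P.
    # Setting planning_steps > 0 adds Dyna-style updates that replay
    # transitions remembered in a simple (state, action) model.
    rng = np.random.default_rng(seed)
    n_states = P.shape[0]
    Q = np.zeros((n_states, n_actions))
    model = {}  # (state, action) -> (reward, next_state)
    for _ in range(episodes):
        s = rng.integers(n_states)
        for _ in range(steps):
            # epsilon-greedy action selection
            a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
            # sample the next power-request state from the transition matrix
            p = P[s] if P[s].sum() > 0 else np.full(n_states, 1.0 / n_states)
            s_next = rng.choice(n_states, p=p)
            # placeholder reward, e.g. negative fuel rate plus a battery SOC penalty
            r = reward_fn(s, a, s_next)
            # one-step Q-learning update
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            model[(s, a)] = (r, s_next)
            # Dyna planning: extra updates drawn from previously seen transitions
            keys = list(model)
            for _ in range(planning_steps):
                ps, pa = keys[rng.integers(len(keys))]
                pr, pn = model[(ps, pa)]
                Q[ps, pa] += alpha * (pr + gamma * Q[pn].max() - Q[ps, pa])
            s = s_next
    return Q

In the paper, the actions correspond to power-split decisions and the reward is tied to fuel consumption; here those details are abstracted into reward_fn, and the state is reduced to the discretized power-request level purely for illustration.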

Keywords: reinforcement learning (RL); hybrid electric tracked vehicle (HETV); Q-learning algorithm; Dyna algorithm; dynamic programming (DP); stochastic dynamic programming (SDP) (search for similar items in EconPapers)
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49 (search for similar items in EconPapers)
Date: 2015
References: View references in EconPapers; View complete reference list from CitEc
Citations: View citations in EconPapers (27)

Downloads: (external link)
https://www.mdpi.com/1996-1073/8/7/7243/pdf (application/pdf)
https://www.mdpi.com/1996-1073/8/7/7243/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:8:y:2015:i:7:p:7243-7260:d:52695

Energies is currently edited by Ms. Agatha Cao

More articles in Energies from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Handle: RePEc:gam:jeners:v:8:y:2015:i:7:p:7243-7260:d:52695