A Q-learning based transient power optimization method for organic Rankine cycle waste heat recovery system in heavy duty diesel engine applications
Bin Xu and
Xiaoya Li
Applied Energy, 2021, vol. 286, issue C, No S0306261921000878
Abstract:
In recent years, organic Rankine cycle waste heat recovery (ORC-WHR) technology has gained popularity in heavy-duty diesel engine applications. Drastic fluctuations in waste heat, caused by the variable daily operation of heavy-duty trucks, pose a severe transient power optimization challenge for ORC-WHR systems. Existing power optimization methods either neglect the transient behavior of the Rankine cycle system or compromise model accuracy for computational efficiency. Unlike the existing literature, this study proposes, for the first time, a model-free reinforcement learning method to achieve online transient power optimization for the ORC-WHR system, and explains the benefits of a learning method in this application. A tabular Q-learning algorithm is formulated to optimize the net power of an experimentally validated ORC-WHR system. Q-learning is explained in detail using state, action, and policy information. To quantify the power optimization achieved by the proposed method, a Proportional-Integral-Derivative (PID) method and state-of-the-art offline and online Dynamic Programming methods are implemented for comparison. The results show that Q-learning generates 22% more cumulative energy than the PID method. Furthermore, Q-learning produces 96.6% of the cumulative energy that offline Dynamic Programming generates over a transient engine condition, while requiring less computation and running online. Additionally, Q-learning produces 0.5% more cumulative energy than the machine learning-based online Dynamic Programming method and exhibits better vapor temperature robustness (4 °C to 28 °C superheat with Q-learning vs. 5 °C to 94 °C superheat with online Dynamic Programming). Given its strong power production performance, low computation cost, and high robustness, the proposed Q-learning method has the potential to improve the power production of ORC-WHR systems with different configurations.
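For readers unfamiliar with the term, the following is a minimal sketch of a generic tabular Q-learning update, not the authors' implementation. The abstract does not specify how states, actions, or rewards are defined for the ORC-WHR system, so the integer state and action indices, the reward value, and the names `make_q_table`, `epsilon_greedy`, and `q_update` are all hypothetical placeholders; the learning rate `alpha`, discount factor `gamma`, and exploration rate `epsilon` are likewise illustrative values.

```python
import random


def make_q_table(n_states, n_actions):
    """Create a Q-table of zeros: one row per state, one column per action."""
    return [[0.0] * n_actions for _ in range(n_states)]


def epsilon_greedy(q_table, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_table[state]))
    row = q_table[state]
    return row.index(max(row))


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical usage: 3 discretized states, 2 discrete actions.
# In an ORC-WHR setting the reward might be net power, but that mapping
# is an assumption here, not taken from the abstract.
q = make_q_table(3, 2)
rng = random.Random(0)
a = epsilon_greedy(q, state=0, epsilon=0.1, rng=rng)
q_update(q, state=0, action=a, reward=1.0, next_state=1)
```

Because the update needs only the observed transition (state, action, reward, next state), no system model is required, which is the sense in which the abstract calls the method model-free.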
Keywords: Waste heat recovery; Heavy-duty diesel engine; Q-learning; Transient power optimization
Date: 2021
Downloads:
http://www.sciencedirect.com/science/article/pii/S0306261921000878
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:appene:v:286:y:2021:i:c:s0306261921000878
Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/journaldescription.cws_home/405891/bibliographic
DOI: 10.1016/j.apenergy.2021.116532
Applied Energy is currently edited by J. Yan
Bibliographic data for series maintained by Catherine Liu.