Tracking Photovoltaic Power Output Schedule of the Energy Storage System Based on Reinforcement Learning
Meijun Guo,
Mifeng Ren,
Junghui Chen,
Lan Cheng and
Zhile Yang
Additional contact information
Meijun Guo: College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China
Mifeng Ren: College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China
Junghui Chen: Department of Chemical Engineering, Chung-Yuan Christian University, Taoyuan 320314, Taiwan
Lan Cheng: College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China
Zhile Yang: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
Energies, 2023, vol. 16, issue 15, 1-15
Abstract:
The inherent randomness, fluctuation, and intermittency of photovoltaic power generation make it difficult to track the scheduling plan. To improve the system's ability to track the photovoltaic generation schedule, a real-time charge/discharge power control method based on deep reinforcement learning is proposed. First, the photovoltaic and energy storage hybrid system and its mathematical model are briefly introduced, and the tracking control problem is defined. Then, power generation plans for different days are clustered into four scenarios with the K-means algorithm, using the mean, standard deviation, and kurtosis of each generation plan as features. Based on the clustering results, the state, action, and reward required for reinforcement learning are defined. Subject to the constraints on the various variables, the proximal policy optimization (PPO) algorithm is used to optimize the charging/discharging power of the energy storage system (ESS) so that the hybrid system tracks the new generation schedule more accurately. Finally, the proposed control method is applied to a photovoltaic power station. The results of several validation experiments indicate that, under the same conditions, the average tracking errors of the proportional-integral-derivative (PID) method, the model predictive control (MPC) method, and the PPO algorithm are 0.374 MW, 0.609 MW, and 0.104 MW, respectively, with corresponding computing times of 1.134 s, 2.760 s, and 0.053 s. These results indicate that the proposed deep reinforcement learning-based control strategy outperforms the traditional methods in terms of generalization and computation time.
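As a rough illustration of the pipeline the abstract describes, the sketch below first clusters daily generation plans by their mean, standard deviation, and kurtosis, then trains a PPO agent to set the ESS charge/discharge power so that PV output plus ESS output tracks the schedule. This is a minimal sketch, not the authors' implementation: the environment name (ESSTrackingEnv), its state layout, reward, parameters (capacity, p_max, dt), and the synthetic data are all hypothetical, and it assumes NumPy, SciPy, scikit-learn, Gymnasium, and Stable-Baselines3 are available.

```python
# Minimal sketch of the two-stage approach in the abstract; all names and
# parameters below are illustrative assumptions, not the paper's code.
import numpy as np
from scipy.stats import kurtosis
from sklearn.cluster import KMeans
import gymnasium as gym
from gymnasium import spaces

def cluster_plans(plans: np.ndarray, n_scenarios: int = 4) -> np.ndarray:
    """Cluster daily generation plans (days x timesteps) into scenarios
    using the mean, standard deviation, and kurtosis as features."""
    feats = np.column_stack([plans.mean(1), plans.std(1), kurtosis(plans, axis=1)])
    return KMeans(n_clusters=n_scenarios, n_init=10, random_state=0).fit_predict(feats)

class ESSTrackingEnv(gym.Env):
    """Toy PV + ESS environment (hypothetical): the agent chooses the ESS
    charge/discharge power so that PV output plus ESS output tracks the plan."""
    def __init__(self, pv, plan, capacity=10.0, p_max=2.0, dt=0.25):
        super().__init__()
        self.pv, self.plan = pv, plan
        self.capacity, self.p_max, self.dt = capacity, p_max, dt
        # Action: normalized ESS power in [-1, 1] (negative = charging).
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        # State: [current PV power, scheduled power, state of charge].
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t, self.soc = 0, 0.5
        return self._obs(), {}

    def _obs(self):
        return np.array([self.pv[self.t], self.plan[self.t], self.soc], dtype=np.float32)

    def step(self, action):
        p_ess = float(action[0]) * self.p_max
        # Clip the power so the state of charge stays within [0, 1]; this
        # stands in for the variable constraints mentioned in the abstract.
        p_ess = np.clip(p_ess, -(1 - self.soc) * self.capacity / self.dt,
                        self.soc * self.capacity / self.dt)
        self.soc -= p_ess * self.dt / self.capacity
        # Reward: negative absolute error between delivered and scheduled power.
        err = abs(self.pv[self.t] + p_ess - self.plan[self.t])
        self.t += 1
        done = self.t >= len(self.plan)
        obs = self._obs() if not done else np.zeros(3, dtype=np.float32)
        return obs, -err, done, False, {}

if __name__ == "__main__":
    from stable_baselines3 import PPO
    rng = np.random.default_rng(0)
    # Synthetic data: 30 daily plans at 15-min resolution (96 steps per day).
    plans = np.clip(rng.normal(3.0, 1.0, (30, 96)), 0, None)
    labels = cluster_plans(plans)            # one of four scenario labels per day
    pv = np.clip(plans[0] + rng.normal(0, 0.3, 96), 0, None)
    model = PPO("MlpPolicy", ESSTrackingEnv(pv, plans[0]), verbose=0)
    model.learn(total_timesteps=10_000)
```

The reward here is simply the negative tracking error, which mirrors the abstract's objective of minimizing the deviation between delivered and scheduled power; in practice a per-scenario policy (one per K-means cluster) or scenario label in the state would reflect how the clustering feeds the reinforcement learning stage.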
Keywords: deep reinforcement learning; energy storage system; photovoltaic power output; schedule tracking control (search for similar items in EconPapers)
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49 (search for similar items in EconPapers)
Date: 2023
Downloads: (external link)
https://www.mdpi.com/1996-1073/16/15/5840/pdf (application/pdf)
https://www.mdpi.com/1996-1073/16/15/5840/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:16:y:2023:i:15:p:5840-:d:1212091
Energies is currently edited by Ms. Agatha Cao
More articles in Energies from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.