Dual-Layer Q-Learning Strategy for Energy Management of Battery Storage in Grid-Connected Microgrids
Khawaja Haider Ali,
Mohammad Abusara,
Asif Ali Tahir and
Saptarshi Das
Additional contact information
Khawaja Haider Ali: Faculty of Environment, Science and Economy, University of Exeter, Penryn Campus, Cornwall TR10 9FE, UK
Mohammad Abusara: Faculty of Environment, Science and Economy, University of Exeter, Penryn Campus, Cornwall TR10 9FE, UK
Asif Ali Tahir: Faculty of Environment, Science and Economy, University of Exeter, Penryn Campus, Cornwall TR10 9FE, UK
Saptarshi Das: Faculty of Environment, Science and Economy, University of Exeter, Penryn Campus, Cornwall TR10 9FE, UK
Energies, 2023, vol. 16, issue 3, 1-17
Abstract:
Real-time energy management of battery storage in grid-connected microgrids can be very challenging due to the intermittent nature of renewable energy sources (RES), load variations, and variable grid tariffs. Two reinforcement learning (RL)-based energy management approaches have previously been used, namely offline and online methods. In offline RL, the agent learns the optimum policy using forecasted generation and load data. Once convergence is achieved, battery commands are dispatched in real time. The performance of this strategy depends heavily on the accuracy of the forecasted data. In online RL, the agent learns the best policy by interacting with the system in real time using real data. Online RL copes better with forecast error but can take longer to converge. This paper proposes a novel dual-layer Q-learning strategy to address this challenge. The first (upper) layer runs offline, using forecasted generation and load data to produce directive commands for the battery system over a 24 h horizon. The second (lower) Q-learning-based layer refines these battery commands every 15 min to account for real-time changes in RES output and load demand. This reduces the convergence time and thereby lowers the overall operating cost of the microgrid compared with online RL alone. The superiority of the proposed dual-layer RL strategy is verified by simulation results comparing it with the individual offline and online RL algorithms.
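The dual-layer idea in the abstract can be sketched as an offline tabular Q-learning pass over forecast data that produces a 24 h plan, whose Q-table then warm-starts a short online refinement pass on realised data. Everything below (the three-action discretisation, the reward shape, parameter values, and the omission of state-of-charge limits) is an illustrative assumption of ours, not the authors' implementation.

```python
import random

# Toy battery-dispatch environment: state = time slot, reward = negative
# cost of importing the residual demand from the grid at the current tariff.
# Positive action = power discharged to the load (kW, illustrative values).
ACTIONS = (-1.0, 0.0, 1.0)  # charge / idle / discharge

def train(net_load, tariff, q=None, episodes=300, alpha=0.1, gamma=0.95,
          eps=0.1, seed=0):
    """Tabular Q-learning over one daily horizon; returns the Q-table."""
    rng = random.Random(seed)
    horizon = len(net_load)
    if q is None:
        q = [[0.0] * len(ACTIONS) for _ in range(horizon)]
    for _ in range(episodes):
        for t in range(horizon):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[t][i])
            grid_import = max(net_load[t] - ACTIONS[a], 0.0)
            reward = -grid_import * tariff[t]
            future = max(q[t + 1]) if t + 1 < horizon else 0.0
            q[t][a] += alpha * (reward + gamma * future - q[t][a])
    return q

def greedy_policy(q):
    """Best action index per time slot from a learned Q-table."""
    return [max(range(len(ACTIONS)), key=lambda i: row[i]) for row in q]

def dual_layer(forecast_load, real_load, tariff):
    # Upper (offline) layer: plan the 24 h horizon on forecast data.
    q = train(forecast_load, tariff)
    # Lower (online) layer: refine with realised data, warm-started from
    # the offline Q-table so far fewer episodes are needed to converge.
    q = train(real_load, tariff, q=q, episodes=30)
    return greedy_policy(q)
```

Warm-starting the online layer from the offline Q-table is the point of the two-layer split: the online pass only has to correct for forecast error rather than learn the policy from scratch.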
Keywords: reinforcement learning (RL); microgrid; energy management; offline and online RL; dual-layer Q-learning
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49
Date: 2023
Downloads:
https://www.mdpi.com/1996-1073/16/3/1334/pdf (application/pdf)
https://www.mdpi.com/1996-1073/16/3/1334/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:16:y:2023:i:3:p:1334-:d:1048005
Energies is currently edited by Ms. Agatha Cao