
Multi-Agent Deep Reinforcement Learning for Scheduling of Energy Storage System in Microgrids

Sang-Woo Jung, Yoon-Young An, BeomKyu Suh, YongBeom Park, Jian Kim and Ki-Il Kim
Additional contact information
Sang-Woo Jung: Department of Computer Science and Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34143, Republic of Korea
Yoon-Young An: ICT Convergence Standards Research Division, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon 34129, Republic of Korea
BeomKyu Suh: Department of Computer Science and Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34143, Republic of Korea
YongBeom Park: Department of Computer Science and Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34143, Republic of Korea
Jian Kim: Department of Computer Science and Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34143, Republic of Korea
Ki-Il Kim: Department of Computer Science and Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34143, Republic of Korea

Mathematics, 2025, vol. 13, issue 12, 1-24

Abstract: Efficient scheduling of Energy Storage Systems (ESSs) within microgrids has emerged as a critical issue for reducing energy costs, enabling peak shaving, and managing battery health. Both single-agent and multi-agent deep reinforcement learning (DRL) approaches have been explored for ESS scheduling. However, single-agent methods scale poorly when multiple objectives must be included, while existing multi-agent methods lack comprehensive consideration of diverse user objectives. To address these issues, this paper proposes a new DRL-based scheduling algorithm built on a multi-agent proximal policy optimization (MAPPO) framework combined with Pareto optimization. The proposed model employs two independent agents: one minimizes electricity costs, and the other minimizes charge/discharge switching frequency to account for battery degradation. The candidate actions generated by the agents are evaluated through Pareto dominance, and the final action is selected via scalarization that reflects operator-defined preferences. Simulation experiments were conducted using real industrial building load and photovoltaic (PV) generation data under realistic South Korean electricity tariff structures. Comparative evaluations against baseline DRL algorithms (TD3, SAC, PPO) demonstrate that the proposed MAPPO method significantly reduces electricity costs while minimizing battery-switching events. The results further show that the proposed method achieves a balanced improvement in both economic efficiency and battery longevity, making it highly applicable to real-world dynamic microgrid environments. Specifically, the proposed MAPPO-based scheduling reduced total electricity cost by 14.68% compared to the No-ESS case and achieved 3.56% greater cost savings than the baseline reinforcement learning algorithms.
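
The abstract describes an action-selection step in which candidate actions from the two agents are first filtered by Pareto dominance over the two objectives (electricity cost and switching frequency) and the final action is then chosen by preference-weighted scalarization. The minimal Python sketch below illustrates that selection step only; the function names, the two-objective tuple, and the default weights are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of the Pareto-dominance + scalarization selection step.
    # Objectives are (electricity cost, switching events); lower is better for both.
    from typing import Callable, List, Sequence, Tuple

    Objectives = Tuple[float, float]

    def dominates(a: Objectives, b: Objectives) -> bool:
        """True if a is no worse than b in every objective and strictly better in one."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def pareto_front(scored: List[Tuple[float, Objectives]]) -> List[Tuple[float, Objectives]]:
        """Keep candidate actions that are not dominated by any other candidate."""
        return [
            (act, obj) for act, obj in scored
            if not any(dominates(other, obj) for _, other in scored if other != obj)
        ]

    def select_action(
        candidates: Sequence[float],               # charge/discharge proposals from both agents
        evaluate: Callable[[float], Objectives],   # maps an action to (cost, switching) estimates
        weights: Tuple[float, float] = (0.7, 0.3), # operator-defined preferences (assumed values)
    ) -> float:
        """Filter by Pareto dominance, then pick by weighted-sum scalarization."""
        scored = [(a, evaluate(a)) for a in candidates]
        front = pareto_front(scored)
        return min(front, key=lambda p: weights[0] * p[1][0] + weights[1] * p[1][1])[0]

Under these assumptions, the operator's weights only break ties among non-dominated candidates, so no dominated action can ever be selected regardless of the preference setting.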

Keywords: energy storage system; scheduling; multi-agent; deep reinforcement learning; multi-objective optimization; Pareto optimization (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2025
References: Add references at CitEc

Downloads: (external link)
https://www.mdpi.com/2227-7390/13/12/1999/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/12/1999/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.


Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:12:p:1999-:d:1680966


Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

Handle: RePEc:gam:jmathe:v:13:y:2025:i:12:p:1999-:d:1680966