EconPapers
Decentralized Multi-Agent Reinforcement Learning Control of Residential Battery Storage for Demand Response

Suhaib Sajid, Bin Li, Badia Berehman, Qi Guo, Yi Kang, Muhammad Athar and Ali Muqtadir
Additional contact information
Suhaib Sajid: School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China
Bin Li: School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China
Badia Berehman: College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210049, China
Qi Guo: Digital Research Branch (Digital Research Institute), Inner Mongolia Power (Group) Company Limited, Hohhot 010010, China
Yi Kang: Digital Research Branch (Digital Research Institute), Inner Mongolia Power (Group) Company Limited, Hohhot 010010, China
Muhammad Athar: School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China
Ali Muqtadir: School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China

Energies, 2025, vol. 18, issue 21, 1-28

Abstract: Automated demand response in residential sectors is critical for grid stability, but centralized control strategies fail to address the unique energy profiles of individual households. This paper introduces a decentralized control framework using multi-agent deep reinforcement learning. We assign an independent Soft Actor–Critic (SAC) agent to each building’s battery energy storage system (BESS), enabling it to learn a control policy tailored to local conditions while responding to shared grid signals. Evaluated in a high-fidelity simulation environment of CityLearn using real-world data, our multi-agent system demonstrated a reduction of approximately 50% in both electricity costs and carbon emissions. Crucially, this decentralized approach considerably outperformed all benchmarks, including a rule-based controller, tabular Q-learning, and even a centralized single-agent SAC controller. At the district level, learned policies flatten the net load profile, lowering daily peaks by 16% and ramping by 26%, and improve the load factor. The resulting dispatch patterns are interpretable and consistent with operator objectives such as peak shaving and valley filling. These findings indicate that decentralized reinforcement learning can translate local optimization into system-level benefits and offers a scalable pathway for aggregators and utilities to operationalize the flexibility of residential storage at scale.
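The abstract's core idea, one independent controller per household battery acting on its own load plus a shared grid signal, can be illustrated with a minimal sketch. This is not the paper's Soft Actor–Critic implementation or the CityLearn API; it substitutes a simple price-threshold rule for the learned policy, and all class and parameter names (capacities, power limits, prices) are illustrative assumptions, chosen only to show the decentralized dispatch loop and its peak-shaving effect.

```python
# Toy stand-in for decentralized per-building battery dispatch.
# Each household reacts only to its own load and a shared price signal;
# the rule-based policy() is a proxy for a learned SAC policy.

from dataclasses import dataclass

@dataclass
class Battery:
    capacity: float = 6.0   # kWh (assumed)
    soc: float = 3.0        # state of charge, kWh
    power: float = 2.0      # max charge/discharge per 1 h step, kW (assumed)

    def step(self, setpoint: float) -> float:
        """Apply a charge (+) / discharge (-) setpoint, clipped to the
        power limit and the available state of charge; return the actual
        energy exchanged with the grid."""
        p = max(-self.power, min(self.power, setpoint))
        p = max(-self.soc, min(self.capacity - self.soc, p))
        self.soc += p
        return p

def policy(load: float, price: float, mean_price: float) -> float:
    """Decentralized heuristic: charge modestly when the shared price is
    below average (valley filling), discharge against the local load when
    it is above (peak shaving)."""
    return 1.0 if price < mean_price else -load

def simulate(loads, prices):
    """One independent battery per household; return district net load."""
    mean_price = sum(prices) / len(prices)
    batteries = [Battery() for _ in loads]
    net = []
    for t, price in enumerate(prices):
        total = 0.0
        for house, bat in zip(loads, batteries):
            exchanged = bat.step(policy(house[t], price, mean_price))
            total += house[t] + exchanged
        net.append(total)
    return net

# Two households, four hourly steps, one evening price peak (made-up data).
loads = [[1.0, 1.0, 3.0, 1.0], [0.5, 1.5, 2.5, 1.0]]
prices = [0.10, 0.10, 0.40, 0.10]
net = simulate(loads, prices)
print(max(net))  # district peak after dispatch → 4.5 (baseline peak is 5.5)
```

Even with this crude stand-in, acting on purely local information plus one shared signal flattens the aggregate profile, which is the mechanism the paper's district-level peak and ramping reductions rely on.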

Keywords: energy optimization; decentralized control; demand response; multi-agent; reinforcement learning; residential buildings (search for similar items in EconPapers)
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49 (search for similar items in EconPapers)
Date: 2025

Downloads: (external link)
https://www.mdpi.com/1996-1073/18/21/5712/pdf (application/pdf)
https://www.mdpi.com/1996-1073/18/21/5712/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.


Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:18:y:2025:i:21:p:5712-:d:1783379


Energies is currently edited by Ms. Cassie Shen

More articles in Energies from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-10-31
Handle: RePEc:gam:jeners:v:18:y:2025:i:21:p:5712-:d:1783379