Deep reinforcement learning control for co-optimizing energy consumption, thermal comfort, and indoor air quality in an office building
Fangzhou Guo,
Sang woo Ham,
Donghun Kim and
Hyeun Jun Moon
Applied Energy, 2025, vol. 377, issue PA, No S0306261924018506
Abstract:
With the recent demand for decarbonization and energy efficiency, advanced HVAC control using Deep Reinforcement Learning (DRL) has become a promising solution. Owing to its flexible structure, DRL has successfully reduced energy use in many HVAC systems. However, only a few studies have applied DRL agents to manage an entire central HVAC system and control multiple components in both the water loop and the air loop, owing to the system's complex structure. Moreover, those studies have not extended their applications to indoor air quality, especially both CO2 and PM2.5 concentrations, on top of energy saving and thermal comfort, as pursuing those objectives simultaneously can cause multiple control conflicts. Furthermore, DRL agents are usually trained in a simulation environment before deployment, so another challenge is to develop an accurate yet relatively simple simulator. Therefore, we propose a DRL algorithm for a central HVAC system that co-optimizes energy consumption, thermal comfort, indoor CO2 level, and indoor PM2.5 level in an office building. To train the controller, we also developed a hybrid simulator that decouples the complex system into multiple simulation models, each calibrated separately with laboratory test data. The hybrid simulator combines the dynamics of the HVAC system and the building envelope, as well as moisture, CO2, and particulate matter transfer. Three control algorithms (rule-based, MPC, and DRL) are developed, and their performance is evaluated in the hybrid simulator environment under a realistic scenario (i.e., with stochastic noise). The test results show that the DRL controller saves 21.4 % of energy compared to the rule-based controller while improving thermal comfort and reducing indoor CO2 concentration.
The MPC controller showed an 18.6 % energy saving compared to the DRL controller, mainly due to savings from comfort and indoor air quality boundary violations caused by unmeasured disturbances; it also highlighted computational challenges for real-time control arising from its non-linear optimization. Finally, we provide practical considerations for designing and implementing the DRL and MPC controllers based on their respective pros and cons.
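The co-optimization described in the abstract implies a reward that trades off energy, thermal comfort, CO2, and PM2.5. As a minimal sketch of how such a multi-objective reward can be shaped for a DRL agent, the function below penalizes energy use directly and applies hinge penalties for excursions outside a comfort band and IAQ limits; all weights, bands, and limits here are illustrative assumptions, not values from the paper.

```python
def hvac_reward(energy_kwh, temp_c, co2_ppm, pm25_ugm3,
                temp_band=(21.0, 25.0), co2_limit=1000.0, pm25_limit=15.0,
                w_energy=1.0, w_comfort=10.0, w_co2=0.01, w_pm25=0.5):
    """Negative weighted cost; a DRL agent maximizes this.

    Weights and thresholds are hypothetical placeholders for illustration.
    """
    # Energy term: penalize consumption directly.
    cost = w_energy * energy_kwh
    # Comfort term: penalize deviation outside the temperature band.
    lo, hi = temp_band
    cost += w_comfort * max(0.0, lo - temp_c, temp_c - hi)
    # IAQ terms: penalize only the excess over each limit (hinge penalty).
    cost += w_co2 * max(0.0, co2_ppm - co2_limit)
    cost += w_pm25 * max(0.0, pm25_ugm3 - pm25_limit)
    return -cost
```

With this shaping, a state inside all comfort and IAQ bounds is penalized only for its energy use, so the agent is free to minimize consumption; violations add cost proportional to their magnitude, which is how the abstract's boundary-violation trade-off would surface during training.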
Keywords: Deep reinforcement learning; Deep deterministic policy gradient; Smart building; Air conditioning; Energy efficiency; Thermal comfort; Indoor air quality
Date: 2025
Downloads: http://www.sciencedirect.com/science/article/pii/S0306261924018506 (full text for ScienceDirect subscribers only)
Persistent link: https://EconPapers.repec.org/RePEc:eee:appene:v:377:y:2025:i:pa:s0306261924018506
Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/journaldescription.cws_home/405891/bibliographic
DOI: 10.1016/j.apenergy.2024.124467
Applied Energy is currently edited by J. Yan
Bibliographic data for series maintained by Catherine Liu ().