A Multi-Agent Reinforcement Learning Approach to Price and Comfort Optimization in HVAC-Systems

Blad, Christian; Bøgh, Simon; Kallesøe, Carsten

Additional contact information
Christian Blad: Robotics & Automation Group, Department of Materials and Production, Aalborg University, 9220 Aalborg, Denmark
Simon Bøgh: Robotics & Automation Group, Department of Materials and Production, Aalborg University, 9220 Aalborg, Denmark
Carsten Kallesøe: Technology and Innovation, Control Department, Grundfos, 8850 Bjerringbro, Denmark

Energies, 2021, vol. 14, issue 22, 1-20

Abstract: This paper addresses the challenge of minimizing training time for the control of Heating, Ventilation, and Air-Conditioning (HVAC) systems with online Reinforcement Learning (RL). This is done by developing a novel Multi-Agent Reinforcement Learning (MARL) approach for HVAC systems. The environment formed by the HVAC system is formulated as a Markov Game (MG) in a general-sum setting. The MARL algorithm has a decentralized structure in which only relevant states are shared between agents, and actions are shared in a sequence that is sensible from a system’s point of view. The simulation environment is a domestic house located in Denmark, designed to resemble an average house. The heat source in the house is an air-to-water heat pump, and the HVAC system is an Underfloor Heating (UFH) system. The house is subjected to weather changes from a data set collected in Copenhagen in 2006, spanning the entire year except June, July, and August, when heating is not required. It is shown that: (1) compared with Single-Agent Reinforcement Learning (SARL), MARL can reduce training time by 70% for a UFH system with four temperature zones; (2) the agent can learn and generalize over seasons; (3) the cost of heating can be reduced by 19%, equivalent to 750 kWh of electric energy per year, for an average Danish domestic house compared to a traditional control method; and (4) oscillations in the room temperature can be reduced by 40% when comparing the RL control methods with a traditional control method.
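
The decentralized structure described in the abstract, where each agent sees only its relevant states and actions are passed along in a sequence, can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the agent names, the state keys, the ordering (zone agents before a supply agent), and the placeholder rules standing in for learned policies are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration; none of these names come from the paper.
Observation = Dict[str, float]
Policy = Callable[[Observation], float]


@dataclass
class Agent:
    name: str
    state_keys: List[str]  # the subset of system states this agent may see
    policy: Policy         # maps the agent's observation to one action


def step_in_sequence(agents: List[Agent], system_state: Dict[str, float]) -> Dict[str, float]:
    """Let agents act one after another, passing earlier actions down the chain."""
    actions: Dict[str, float] = {}
    for agent in agents:
        # Only the states relevant to this agent are shared with it.
        obs = {k: system_state[k] for k in agent.state_keys}
        # Actions already chosen by earlier agents are added to the observation.
        obs.update({f"action_{name}": a for name, a in actions.items()})
        actions[agent.name] = agent.policy(obs)
    return actions


def zone_policy(setpoint: float, key: str) -> Policy:
    """Placeholder on/off rule standing in for a learned zone policy."""
    return lambda obs: 1.0 if obs[key] < setpoint else 0.0


def supply_policy(obs: Observation) -> float:
    """Placeholder rule: raise the supply temperature with each open valve."""
    open_valves = sum(v for k, v in obs.items() if k.startswith("action_"))
    return 30.0 + 2.5 * open_valves


# Assumed ordering: zone agents act first, then the supply agent.
agents = [
    Agent("zone_1", ["temp_zone_1"], zone_policy(21.0, "temp_zone_1")),
    Agent("zone_2", ["temp_zone_2"], zone_policy(21.0, "temp_zone_2")),
    Agent("supply", ["temp_outdoor"], supply_policy),
]
state = {"temp_zone_1": 20.2, "temp_zone_2": 21.6, "temp_outdoor": 3.0}
actions = step_in_sequence(agents, state)
```

In this sketch the supply agent never sees the zone temperatures directly; it only sees the outdoor temperature and the valve actions handed down by the zone agents, which is one plausible reading of sharing only relevant states and sharing actions in a sequence.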

Keywords: deep reinforcement learning; artificial intelligence; HVAC-systems; underfloor heating; energy in buildings; predictive analytics
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49
Date: 2021
Citations: 3

Downloads: (external link)
https://www.mdpi.com/1996-1073/14/22/7491/pdf (application/pdf)
https://www.mdpi.com/1996-1073/14/22/7491/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:14:y:2021:i:22:p:7491-:d:675495

Energies is currently edited by Ms. Agatha Cao
