Safe Reinforcement Learning for Buildings: Minimizing Energy Use While Maximizing Occupant Comfort

Mohammad Esmaeili, Sascha Hammes, Samuele Tosatto, David Geisler-Moroder and Philipp Zech
Additional contact information
Mohammad Esmaeili: Department of Computer Science, University of Innsbruck, 6020 Innsbruck, Austria
Sascha Hammes: Unit of Energy Efficient Building, University of Innsbruck, 6020 Innsbruck, Austria
Samuele Tosatto: Department of Computer Science, University of Innsbruck, 6020 Innsbruck, Austria
David Geisler-Moroder: Unit of Energy Efficient Building, University of Innsbruck, 6020 Innsbruck, Austria
Philipp Zech: Department of Computer Science, University of Innsbruck, 6020 Innsbruck, Austria

Energies, 2025, vol. 18, issue 19, 1-34

Abstract: With buildings accounting for 40% of global energy consumption, heating, ventilation, and air conditioning (HVAC) systems represent the single largest opportunity for emissions reduction, consuming up to 60% of commercial building energy while maintaining occupant comfort. This critical balance between energy efficiency and human comfort has traditionally relied on rule-based and model predictive control strategies. Given the multi-objective nature and complexity of modern HVAC systems, these approaches fall short of satisfying both objectives. Recently, reinforcement learning (RL) has emerged as a method capable of learning optimal control policies directly from system interactions without requiring explicit models. However, standard RL approaches frequently violate comfort constraints during exploration, making them unsuitable for real-world deployment where occupant comfort cannot be compromised. This paper addresses two fundamental challenges in HVAC control: the difficulty of constrained optimization in RL and the challenge of defining appropriate comfort constraints across diverse conditions. We adopt a safe RL framework with neural barrier certificates that (1) transforms the constrained HVAC problem into an unconstrained optimization and (2) constructs these certificates in a data-driven manner using neural networks, adapting to building-specific comfort patterns without manual threshold setting. This approach enables the agent to almost guarantee solutions that improve energy efficiency while respecting the defined comfort limits. We validate our approach through seven experiments spanning residential and commercial buildings, from single-zone heat pump control to five-zone variable air volume (VAV) systems. Our safe RL framework reduces energy use compared to baseline operation while maintaining higher comfort compliance than unconstrained RL.
The data-driven barrier construction discovers building-specific comfort patterns, enabling context-aware optimization that is impossible with fixed thresholds. Neural approximation prevents absolute safety guarantees, but the approach reduces catastrophic safety failures compared to unconstrained RL while remaining adaptable, which positions it as a developmental bridge between RL theory and real-world building automation. However, the considerable gap in both safety and energy performance relative to rule-based control indicates that the method requires substantial improvement before practical deployment.

Keywords: HVAC system; reinforcement learning; safe reinforcement learning; neural barrier certificates
JEL-codes: Q Q0 Q4 Q40 Q41 Q42 Q43 Q47 Q48 Q49
Date: 2025

Downloads: (external link)
https://www.mdpi.com/1996-1073/18/19/5313/pdf (application/pdf)
https://www.mdpi.com/1996-1073/18/19/5313/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jeners:v:18:y:2025:i:19:p:5313-:d:1767087

Energies is currently edited by Ms. Cassie Shen

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-10-10
Handle: RePEc:gam:jeners:v:18:y:2025:i:19:p:5313-:d:1767087