CRISP-DM-Based Data-Driven Approach for Building Energy Prediction Utilizing Indoor and Environmental Factors
Moaaz Elkabalawy,
Abobakr Al-Sakkaf,
Eslam Mohammed Abdelkader and
Ghasan Alfalah
Additional contact information
Moaaz Elkabalawy: Department of Building, Civil, and Environmental Engineering, Concordia University, Montreal, QC H3G 1M8, Canada
Abobakr Al-Sakkaf: Department of Building, Civil, and Environmental Engineering, Concordia University, Montreal, QC H3G 1M8, Canada
Eslam Mohammed Abdelkader: Structural Engineering Department, Faculty of Engineering, Cairo University, Giza 12613, Egypt
Ghasan Alfalah: Department of Architecture and Building Science, College of Architecture and Planning, King Saud University, Riyadh 11362, Saudi Arabia
Sustainability, 2024, vol. 16, issue 17, 1-21
Abstract:
The significant energy consumption associated with the built environment demands comprehensive energy prediction modelling. Leveraging their ability to capture intricate patterns without extensive domain knowledge, supervised data-driven approaches present a marked advantage in adaptability over traditional physics-based building energy models. This study employs various machine learning models to predict energy consumption for an office building in Berkeley, California. To enhance the accuracy of these predictions, different feature selection techniques, including principal component analysis (PCA), decision tree regression (DTR), and Pearson correlation analysis, were adopted to identify key attributes of energy consumption and address collinearity. The analyses yielded nine influential attributes: heating, ventilation, and air conditioning (HVAC) system operating parameters, indoor and outdoor environmental parameters, and occupancy. To overcome missing occupancy data in the datasets, we investigated the possibility of Wi-Fi-based occupancy prediction using different machine learning algorithms. The results of the occupancy prediction modelling indicate that Wi-Fi can be used with acceptable accuracy in predicting occupancy count, which can be leveraged to analyze occupant comfort and enhance the accuracy of building energy models. Six machine learning models were tested for energy prediction using two different datasets: one before and one after occupancy prediction. Using 10-fold cross-validation with an 8:2 training-to-testing ratio, the Random Forest algorithm performed best, exhibiting the highest R² value of 0.92 and the lowest RMSE of 3.78 when occupancy data were included. Additionally, an error propagation analysis was conducted to assess the impact of the Wi-Fi-based occupancy prediction model’s error on the energy prediction model. The results indicated that Wi-Fi-based occupancy prediction can improve the data inputs for building energy models, leading to more accurate energy consumption predictions. The findings underscore the potential of integrating the developed energy prediction models with fault detection systems, model predictive controllers, and energy load shape analysis, ultimately enhancing energy management practices.
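As an illustration of two ideas named in the abstract, the sketch below shows a Pearson-correlation screen for collinear attributes and the R² and RMSE metrics used to compare models. It is a minimal example under stated assumptions, not the authors' pipeline: the function names, the 0.9 threshold, the greedy keep-first rule, and all data values are hypothetical and chosen only for demonstration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_collinear(features, threshold=0.9):
    """Greedy screen (assumed rule): keep a feature only if its |r| with
    every already-kept feature is at or below the threshold."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up toy data (not from the study).
features = {
    "outdoor_temp": [10.0, 12.0, 15.0, 18.0, 21.0],
    "outdoor_temp_f": [50.0, 53.6, 59.0, 64.4, 69.8],  # same signal in Fahrenheit
    "occupancy": [5.0, 20.0, 8.0, 25.0, 12.0],
}
kept = drop_collinear(features)

y_true = [30.0, 42.0, 35.0, 50.0, 41.0]  # hypothetical measured energy use
y_pred = [32.0, 40.0, 36.0, 48.0, 41.0]  # hypothetical model predictions
print(kept, round(rmse(y_true, y_pred), 3), round(r2(y_true, y_pred), 3))
```

In this toy case the Fahrenheit column is a linear copy of the Celsius column, so the screen drops it and retains the weakly correlated occupancy column. A real study would apply the metrics to held-out folds rather than to a single fixed prediction vector.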
Keywords: smart buildings; sustainability; data mining; data-driven energy prediction models; CRISP-DM; supervised machine learning; occupancy prediction; feature selection analysis
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2024
Downloads:
https://www.mdpi.com/2071-1050/16/17/7249/pdf (application/pdf)
https://www.mdpi.com/2071-1050/16/17/7249/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:16:y:2024:i:17:p:7249-:d:1462307
Sustainability is currently edited by Ms. Alexandra Wu
Bibliographic data for series maintained by MDPI Indexing Manager.