Data-driven estimation of building energy consumption with multi-source heterogeneous data
Yue Pan and
Limao Zhang
Applied Energy, 2020, vol. 268, issue C, No S0306261920304773
Abstract:
For better energy evaluation and management, a categorical boosting (CatBoost)-based predictive method is presented to accurately estimate building energy consumption by learning large volumes of multi-source heterogeneous data collected from buildings. To be specific, the newly-developed CatBoost model belonging to the ensemble learning has superiority in handling categorical variables and producing reliable results. As a case study, our proposed method is validated in a multi-dimensional dataset about Seattle's building energy performance provided by the city’s government, aiming to estimate the weather normalized site energy use intensity of buildings and characterize its non-linear relationship with other 12 possible influential features. Results from the 5-fold cross-validation demonstrate that the model exhibits a strong ability in predicting the exact value of energy intensity precisely, which can even outperform popular machine learning algorithms including random forest and gradient boosting decision tree under R2 of 0.897. Based on a defined threshold, these predicted values can be classified as the normal or abnormal energy consumption reaching an accuracy of 99.32% for outlier detection, which is helpful in alarming potential risks at an early stage and developing strategies to enhance the energy efficiency. Moreover, results from the established model can be interpreted objectively, suggesting that features concerning the physical and energy characteristics contribute more to energy estimation than environmental features. Since such results understand the building energy consumption and efficiency in a data-driven manner, they can eventually serve as guidance for building owners and designers in designing and renovating buildings to achieve better energy-conserving performance.
Keywords: Building energy estimation; Data mining; Categorical boosting (CatBoost) model; Feature importance (search for similar items in EconPapers)
Date: 2020
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (26)
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0306261920304773
Full text for ScienceDirect subscribers only
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:eee:appene:v:268:y:2020:i:c:s0306261920304773
Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/journaldescription.cws_home/405891/bibliographic
http://www.elsevier. ... 405891/bibliographic
DOI: 10.1016/j.apenergy.2020.114965
Access Statistics for this article
Applied Energy is currently edited by J. Yan
More articles in Applied Energy from Elsevier
Bibliographic data for series maintained by Catherine Liu ().