Toward Large Energy Models: A comparative study of Transformers’ efficacy for energy forecasting
Yueyan Gu,
Farrokh Jazizadeh and
Xuan Wang
Applied Energy, 2025, vol. 384, issue C, No S0306261925000881
Abstract:
Buildings’ significant contribution to global energy demand and emissions highlights the need for precise energy forecasting for effective management. Existing research on energy forecasting commonly focuses on specific target problems, such as individual buildings or small groups of buildings, leading to current challenges in data-driven forecasting, including dependence on data quality and quantity, limited generalizability, and computational inefficiency. To address these challenges, Generalized Energy Models (GEMs) for energy forecasting can potentially be developed using large-scale datasets. Transformers, known for their scalability, their ability to capture long-term dependencies, and their efficiency in parallel processing of large datasets, are strong candidates for GEMs. In this study, we tested the hypothesis that GEMs can be developed efficiently and can outperform in-situ (i.e., building-specific) models trained solely on data from individual buildings. To this end, we investigated and compared three candidate multivariate Transformer architectures, using both zero-shot and fine-tuning strategies, with data from 1,014 buildings. The results, evaluated across three prediction horizons (24, 72, and 168 h), confirm that GEMs significantly outperform Transformer-based in-situ models. Fine-tuned GEMs improved MSE by up to 28% and reduced training time by 55%. Beyond Transformer-based in-situ models, GEMs also outperformed several state-of-the-art non-Transformer deep learning baselines in both effectiveness and efficiency. We further examined several questions, including the data size required for effective fine-tuning and the impact of input sub-sequence length and pre-training dataset size on GEMs’ performance. The findings show a statistically significant performance boost from larger pre-training datasets, highlighting the potential for larger GEMs built on web-scale global data as a step toward Large Energy Models (LEMs).
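Note: The abstract describes a pre-train-then-adapt workflow: a generalized model (GEM) is pre-trained on pooled data from many buildings, then applied zero-shot to a new building or fine-tuned on a small amount of building-specific data. The sketch below illustrates that workflow in outline only; the model, hyperparameters, and data are assumptions (a generic PyTorch Transformer encoder and random placeholder tensors), not the paper's actual architectures or the 1,014-building dataset.

    import torch
    import torch.nn as nn

    class SimpleForecaster(nn.Module):
        """Generic multivariate Transformer encoder mapping an input
        sub-sequence to a multi-step forecast (e.g., a 24 h horizon)."""
        def __init__(self, n_features, d_model=64, n_heads=4, n_layers=2, horizon=24):
            super().__init__()
            self.embed = nn.Linear(n_features, d_model)
            layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                               dim_feedforward=128, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Linear(d_model, horizon)  # forecast for the target channel

        def forward(self, x):                # x: (batch, seq_len, n_features)
            h = self.encoder(self.embed(x))  # (batch, seq_len, d_model)
            return self.head(h[:, -1, :])    # forecast from the last encoded step

    def train(model, loader, epochs, lr):
        # Plain MSE training loop (the paper also reports MSE).
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                loss.backward()
                opt.step()
        return model

    # Pre-training on pooled multi-building data (placeholder for 1,014 buildings).
    n_features, seq_len, horizon = 8, 168, 24
    pooled_x = torch.randn(512, seq_len, n_features)
    pooled_y = torch.randn(512, horizon)
    pooled = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(pooled_x, pooled_y), batch_size=64, shuffle=True)
    gem = SimpleForecaster(n_features, horizon=horizon)
    gem = train(gem, pooled, epochs=2, lr=1e-3)

    # Zero-shot: apply the pre-trained GEM directly to a new building.
    new_building_x = torch.randn(16, seq_len, n_features)
    with torch.no_grad():
        zero_shot_pred = gem(new_building_x)

    # Fine-tuning on a small amount of building-specific data (lower learning rate).
    ft_x = torch.randn(64, seq_len, n_features)
    ft_y = torch.randn(64, horizon)
    ft = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(ft_x, ft_y), batch_size=16, shuffle=True)
    gem = train(gem, ft, epochs=1, lr=1e-4)

In this sketch, longer horizons (72 or 168 h) and different input sub-sequence lengths are obtained simply by changing the horizon and seq_len values; the paper's comparison of architectures and fine-tuning data sizes is not reproduced here.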
Keywords: Energy forecasting; Transformer models; Generalizability; Scalability; Large model; Multivariate time series; Foundation models
Date: 2025
Downloads: http://www.sciencedirect.com/science/article/pii/S0306261925000881 (full text for ScienceDirect subscribers only)
Persistent link: https://EconPapers.repec.org/RePEc:eee:appene:v:384:y:2025:i:c:s0306261925000881
Ordering information: This journal article can be ordered from
http://www.elsevier.com/wps/find/journaldescription.cws_home/405891/bibliographic
DOI: 10.1016/j.apenergy.2025.125358
Applied Energy is currently edited by J. Yan