Sparse spatial–temporal attention forecasting network: A new model for time series forecasting
Mi Wen,
Junjie Huan,
Minjie Wei,
Yun Su and
Naiwang Guo
Additional contact information
Mi Wen: Shanghai University of Electric Power, Shanghai, P. R. China
Junjie Huan: Shanghai University of Electric Power, Shanghai, P. R. China
Minjie Wei: Shanghai University of Electric Power, Shanghai, P. R. China
Yun Su: State Grid Electric Power Research Institute, Nanjing, P. R. China
Naiwang Guo: State Grid Electric Power Research Institute, Nanjing, P. R. China
International Journal of Modern Physics C (IJMPC), 2025, vol. 36, issue 07, 1-21
Abstract:
Time series forecasting has practical applications in many fields, such as power load forecasting and traffic flow prediction. However, most existing methods model temporal correlations alone and give little consideration to the spatial dimension, even though complex spatial–temporal coupling now strongly influences forecasting accuracy. Inspired by the recent success of Transformers in the graph domain and their successful application to forecasting, we propose a new model for time series prediction, the Sparse Spatial–Temporal Attention Forecasting Network. The network performs adaptive spatial–temporal attention modeling on spatial–temporal graphs and incorporates a sparsity measure function, an MLP, and average pooling into the traditional Transformer's multi-head self-attention mechanism to better weight the attention scores. To further improve accuracy, it constructs a weighted adjacency matrix from the topological relationships between spatially adjacent nodes and information such as latitude and longitude, and builds an adaptive graph on this matrix to capture hidden spatial–temporal coupling relationships. It also employs a sequence decomposition algorithm based on average pooling that divides the data into two main components, each predicted separately. Experimental results on the Shanghai Pudong power load dataset and the public PEMS-BAY dataset demonstrate the network's superior performance.
Keywords: Time series forecasting; Transformer; multi-head self-attention mechanism; deep learning; sequence decomposition
Date: 2025
Downloads: http://www.worldscientific.com/doi/abs/10.1142/S0129183124502589
Access to full text is restricted to subscribers
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Persistent link: https://EconPapers.repec.org/RePEc:wsi:ijmpcx:v:36:y:2025:i:07:n:s0129183124502589
DOI: 10.1142/S0129183124502589
International Journal of Modern Physics C (IJMPC) is currently edited by H. J. Herrmann
More articles in International Journal of Modern Physics C (IJMPC) from World Scientific Publishing Co. Pte. Ltd.
Bibliographic data for series maintained by Tai Tone Lim.