Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data
Zhaoxing Gao and Ruey S. Tsay
Journal of the American Statistical Association, 2023, vol. 118, issue 544, 2698-2711
Abstract:
This article proposes a hierarchical approximate-factor approach to analyzing high-dimensional, large-scale heterogeneous time series data using distributed computing. The new method employs a multiple-fold dimension reduction procedure using Principal Component Analysis (PCA) and shows great promise for modeling large-scale data that can be neither stored nor analyzed by a single machine. Each computer at the basic level performs a PCA to extract common factors among the time series assigned to it and transfers those factors to one and only one node of the second level. Each second-level computer collects the common factors from its subordinates and performs another PCA to select the second-level common factors. This process is repeated until the central server is reached, which collects factors from its direct subordinates and performs a final PCA to select the global common factors. The noise terms of the second-level approximate factor model are the unique common factors of the first-level clusters. We focus on the case of two levels in our theoretical derivations, but the idea can easily be generalized to any finite number of hierarchies, and the proposed method is also applicable to data with heterogeneous and multilevel subcluster structures that are stored and analyzed by a single machine. We introduce a new diffusion index approach to forecasting based on the global and group-specific factors. Some clustering methods are discussed in the supplement when the group memberships are unknown. We further extend the analysis to unit-root nonstationary time series. Asymptotic properties of the proposed method are derived for the diverging dimension of the data in each computing unit and the sample size T. We use both simulated and real examples to assess the performance of the proposed method in finite samples, and compare our method with the commonly used ones in the literature concerning the forecasting ability of extracted factors.
Supplementary materials for this article are available online.
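The two-level procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (pca_factors, two_level_factors), the number of factors per level, the simulated data, and the normalization F'F/T = I are all assumptions made for illustration, and the "nodes" are simulated as a list of arrays on one machine.

```python
import numpy as np

def pca_factors(X, r):
    """Return T x r principal-component factors of a T x p panel X.

    Assumed normalization: F'F / T = I_r (a common PCA convention).
    """
    Xc = X - X.mean(axis=0)                      # center each series
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    T = X.shape[0]
    return np.sqrt(T) * U[:, :r]                 # scaled left singular vectors

def two_level_factors(groups, r_group, r_global):
    """Level 1: PCA within each group. Level 2: PCA on the stacked factors."""
    # Each first-level node only sees its own block of series.
    level1 = [pca_factors(X, r_group) for X in groups]
    # The central server receives the factors, not the raw data.
    stacked = np.hstack(level1)
    global_f = pca_factors(stacked, r_global)
    return level1, global_f

# Hypothetical simulated panel: one global factor plus one factor per group.
rng = np.random.default_rng(0)
T, p, n_groups = 200, 30, 3
g = rng.standard_normal(T)                       # global factor
groups = []
for _ in range(n_groups):
    f = rng.standard_normal(T)                   # group-specific factor
    loadings = rng.standard_normal((2, p))
    X = np.column_stack([g, f]) @ loadings + rng.standard_normal((T, p))
    groups.append(X)

level1, global_f = two_level_factors(groups, r_group=2, r_global=1)
# Absolute correlation between the estimated and true global factor
# (factors are identified only up to sign).
corr = abs(np.corrcoef(global_f[:, 0], g)[0, 1])
print(f"|corr(global factor, g)| = {corr:.3f}")
```

Only the T x r_group factor matrices move between levels, which is what keeps communication small relative to the raw T x p blocks.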
Date: 2023
Downloads: (external link)
http://hdl.handle.net/10.1080/01621459.2022.2071279 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:jnlasa:v:118:y:2023:i:544:p:2698-2711
DOI: 10.1080/01621459.2022.2071279
Journal of the American Statistical Association is currently edited by Xuming He, Jun Liu, Joseph Ibrahim and Alyson Wilson
Bibliographic data for series maintained by Chris Longhurst.