Perturbation theory for cross data matrix-based PCA

Shao-Hsuan Wang and Su-Yun Huang

Journal of Multivariate Analysis, 2022, vol. 190, issue C

Abstract: Principal component analysis (PCA) has long been a useful and important tool for dimension reduction. However, this method must be used with care under certain circumstances such as high dimension and small sample size. In general, low dimension with large sample size or large signal to noise ratio is vital to guarantee the consistency of the leading eigenvalues and eigenvectors obtained by PCA. Cross data matrix (CDM)-based PCA is another way to estimate PCA components, through splitting data into two subsets and calculating singular value decomposition for the cross product of the corresponding covariance matrices. It has been shown that CDM-based PCA has a broader region of consistency than ordinary PCA for leading eigenvalues and eigenvectors. Although the difference in regions of consistency is well studied, an interesting practical as well as theoretical question is how they differ in eigenvalues and eigenvectors estimation, especially for the case where both fall in a common region of consistency. In this article, we derive the finite sample approximation results as well as the asymptotic behavior for CDM-based PCA via matrix perturbation. Furthermore, we also derive a comparison measure for CDM-based PCA vs. ordinary PCA. This measure only depends on the data dimension, noise correlations and the noise-to-signal ratio (NSR). Using this measure, we develop an algorithm, which selects good partitions and integrates results from these good partitions to form a final estimate for CDM-based PCA. Numerical and real data examples are presented for illustration.
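The split-and-SVD step described in the abstract can be illustrated with a minimal sketch. It assumes a single random, roughly even split, forms the product of the two subset sample covariance matrices, and takes square roots of its singular values as eigenvalue estimates (because the product approximates the squared covariance). The function name `cdm_pca` and its parameters are hypothetical and not taken from the paper. The partition-selection algorithm mentioned in the abstract is not implemented here.

```python
import numpy as np

def cdm_pca(X, k, seed=None):
    """Illustrative CDM-style PCA on an (n, p) data matrix X.

    Returns the top-k eigenvalue estimates and a (p, k) matrix of
    eigenvector estimates. This is a sketch, not the authors' code.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)          # random partition of the rows
    half = n // 2
    X1, X2 = X[perm[:half]], X[perm[half:]]

    # Sample covariance of each subset (np.cov centers the data).
    S1 = np.cov(X1, rowvar=False)
    S2 = np.cov(X2, rowvar=False)

    # SVD of the cross product of the two covariance matrices.
    U, s, _ = np.linalg.svd(S1 @ S2)

    # Assumption: S1 @ S2 approximates Sigma^2, so its singular values
    # are on the squared-eigenvalue scale.
    return np.sqrt(s[:k]), U[:, :k]

if __name__ == "__main__":
    # Simulated single-spike data: variance 25 on the first coordinate.
    rng = np.random.default_rng(0)
    scales = np.ones(20)
    scales[0] = 5.0
    X = rng.standard_normal((400, 20)) * scales
    vals, vecs = cdm_pca(X, k=1, seed=1)
    print(vals, np.abs(vecs[0, 0]))
```

Forming the full p-by-p covariance matrices keeps the sketch close to the abstract's wording, but it is costly when the dimension is large; a practical implementation in the high-dimension, low-sample-size setting would likely work with smaller matrices instead.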

Keywords: Cross data matrix; Finite sample approximation; High dimension and low sample size; Matrix perturbation; Principal component analysis; Spiked covariance model
Date: 2022

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0047259X22000082
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:jmvana:v:190:y:2022:i:c:s0047259x22000082

DOI: 10.1016/j.jmva.2022.104960

Journal of Multivariate Analysis is currently edited by de Leeuw, J.

Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:jmvana:v:190:y:2022:i:c:s0047259x22000082