New Asymptotic Results in Principal Component Analysis
Vladimir Koltchinskii and Karim Lounici
Additional contact information
Vladimir Koltchinskii: Georgia Institute of Technology
Karim Lounici: Georgia Institute of Technology
Sankhya A: The Indian Journal of Statistics, 2017, vol. 79, issue 2, No 6, 254-297
Abstract:
Let $X$ be a mean zero Gaussian random vector in a separable Hilbert space ${\mathbb H}$ with covariance operator ${\Sigma}:={\mathbb E}(X\otimes X).$ Let ${\Sigma}=\sum_{r\geq 1}\mu_{r} P_{r}$ be the spectral decomposition of $\Sigma$ with distinct eigenvalues $\mu_{1}>\mu_{2}>\dots$ and the corresponding spectral projectors $P_{1}, P_{2}, \dots.$ Given a sample $X_{1},\dots,X_{n}$ of size $n$ of i.i.d. copies of $X$, the sample covariance operator is defined as $\hat{\Sigma}_{n} := n^{-1}\sum_{j=1}^{n} X_{j}\otimes X_{j}.$ The main goal of principal component analysis is to estimate spectral projectors $P_{1}, P_{2}, \dots$ by their empirical counterparts $\hat P_{1}, \hat P_{2}, \dots$ properly defined in terms of spectral decomposition of the sample covariance operator $\hat{\Sigma}_{n}.$ The aim of this paper is to study asymptotic distributions of important statistics related to this problem, in particular, of statistic $\|\hat P_{r}-P_{r}\|_{2}^{2},$ where $\|\cdot\|_{2}^{2}$ is the squared Hilbert–Schmidt norm. This is done in a “high-complexity” asymptotic framework in which the so called effective rank $\textbf{r}({\Sigma}):=\frac{\text{tr}({\Sigma})}{\|{\Sigma}\|_{\infty}}$ ($\text{tr}(\cdot)$ being the trace and $\|\cdot\|_{\infty}$ being the operator norm) of the true covariance $\Sigma$ is becoming large simultaneously with the sample size $n$, but $\textbf{r}({\Sigma}) = o(n)$ as $n\to\infty.$ In this setting, we prove that, in the case of one-dimensional spectral projector $P_{r}$, the properly centered and normalized statistic $\|\hat P_{r}-P_{r}\|_{2}^{2}$ with data-dependent centering and normalization converges in distribution to a Cauchy type limit.
The proofs of this and other related results rely on perturbation analysis and Gaussian concentration.
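The objects defined in the abstract can be illustrated numerically in finite dimension. The following is a minimal sketch, not the authors' procedure: it assumes a diagonal covariance in $\mathbb{R}^d$ with a simple top eigenvalue (so the true one-dimensional projector $P_1$ is the projector onto the first coordinate axis), uses plain power iteration in place of a full eigendecomposition, and computes only the raw statistic $\|\hat P_{1}-P_{1}\|_{2}^{2}$ without the data-dependent centering and normalization studied in the paper. All function names, the eigenvalue profile, and the values of `n` and `d` are hypothetical choices made for illustration.

```python
import math
import random


def effective_rank(eigenvalues):
    """r(Sigma) = tr(Sigma) / ||Sigma||_inf, computed from the eigenvalues."""
    return sum(eigenvalues) / max(eigenvalues)


def draw_sample(eigenvalues, n, rng):
    """n i.i.d. mean zero Gaussian vectors with covariance diag(eigenvalues)."""
    scales = [math.sqrt(mu) for mu in eigenvalues]
    return [[s * rng.gauss(0.0, 1.0) for s in scales] for _ in range(n)]


def sample_covariance(samples):
    """Sigma_hat_n = n^{-1} sum_j X_j (x) X_j (no centering: X has mean zero)."""
    n, d = len(samples), len(samples[0])
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        for a in range(d):
            row, xa = cov[a], x[a]
            for b in range(d):
                row[b] += xa * x[b]
    return [[entry / n for entry in row] for row in cov]


def top_eigenvector(matrix, iterations=300):
    """Leading unit eigenvector by power iteration (assumes a spectral gap)."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(m * c for m, c in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def squared_hs_distance(u, v):
    """||u(x)u - v(x)v||_2^2 = 2 (1 - <u, v>^2) for unit vectors u, v."""
    inner = sum(a * b for a, b in zip(u, v))
    return max(0.0, 2.0 * (1.0 - inner * inner))


def simulate(n=400, d=30, seed=0):
    """Return (effective rank, ||P_hat_1 - P_1||_2^2) for one simulated sample."""
    rng = random.Random(seed)
    # Hypothetical spectrum: mu_1 = 4, then distinct values decreasing from 1.
    eigenvalues = [4.0] + [1.0 - 0.01 * k for k in range(d - 1)]
    samples = draw_sample(eigenvalues, n, rng)
    theta_hat = top_eigenvector(sample_covariance(samples))
    theta = [1.0] + [0.0] * (d - 1)  # true top eigenvector of the diagonal Sigma
    return effective_rank(eigenvalues), squared_hs_distance(theta_hat, theta)


if __name__ == "__main__":
    r, stat = simulate()
    print(f"effective rank r(Sigma) = {r:.3f}")
    print(f"||P_hat_1 - P_1||_2^2   = {stat:.4f}")
```

The closed form in `squared_hs_distance` holds because both projectors have rank one, so $\|\hat P_{1}-P_{1}\|_{2}^{2} = \text{tr}(\hat P_{1}) + \text{tr}(P_{1}) - 2\,\text{tr}(\hat P_{1}P_{1}) = 2 - 2\langle \hat\theta, \theta\rangle^{2}$, which avoids forming the projectors explicitly and does not depend on the sign of the estimated eigenvector.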
Keywords: Sample covariance; Spectral projectors; Effective rank; Principal component analysis; Asymptotic distribution; Perturbation theory; Primary 62H25; 62H12; Secondary 60B20; 60G1.
Date: 2017
Downloads: (external link)
http://link.springer.com/10.1007/s13171-017-0106-6 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:sankha:v:79:y:2017:i:2:d:10.1007_s13171-017-0106-6
Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/13171
DOI: 10.1007/s13171-017-0106-6
Sankhya A: The Indian Journal of Statistics is currently edited by Dipak Dey