EconPapers    
Economics at your fingertips  
 

Selecting the number of components in principal component analysis using cross-validation approximations

Julie Josse and François Husson

Computational Statistics & Data Analysis, 2012, vol. 56, issue 6, 1869-1879

Abstract: Cross-validation is a tried and tested approach to select the number of components in principal component analysis (PCA), however, its main drawback is its computational cost. In a regression (or in a non parametric regression) setting, criteria such as the general cross-validation one (GCV) provide convenient approximations to leave-one-out cross-validation. They are based on the relation between the prediction error and the residual sum of squares weighted by elements of a projection matrix (or a smoothing matrix). Such a relation is then established in PCA using an original presentation of PCA with a unique projection matrix. It enables the definition of two cross-validation approximation criteria: the smoothing approximation of the cross-validation criterion (SACV) and the GCV criterion. The method is assessed with simulations and gives promising results.

Keywords: PCA; Number of components; Cross-validation; Smoothing matrix; Generalized cross-validation (search for similar items in EconPapers)
Date: 2012
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (12)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947311004099
Full text for ScienceDirect subscribers only.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:56:y:2012:i:6:p:1869-1879

DOI: 10.1016/j.csda.2011.11.012

Access Statistics for this article

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu ().

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:56:y:2012:i:6:p:1869-1879