Comparison of penalty functions for sparse canonical correlation analysis

Prabhakar Chalise and Brooke L. Fridley

Computational Statistics & Data Analysis, 2012, vol. 56, issue 2, 245-254

Abstract: Canonical correlation analysis (CCA) is a widely used multivariate method for assessing the association between two sets of variables. However, when the number of variables far exceeds the number of subjects, such as in the case of large-scale genomic studies, the traditional CCA method is not appropriate. In addition, when the variables are highly correlated, the sample covariance matrices become unstable or undefined. To overcome these two issues, sparse canonical correlation analysis (SCCA) for multiple data sets has been proposed using a Lasso-type penalty. However, these methods do not have direct control over the sparsity of the solution. An additional step that uses a Bayesian Information Criterion (BIC) has also been suggested to further filter out unimportant features. In this paper, a comparison of four penalty functions (Lasso, Elastic-net, smoothly clipped absolute deviation (SCAD), and Hard-threshold) for SCCA, with and without the BIC filtering step, has been carried out using both real and simulated genotypic and mRNA expression data. This study indicates that the SCAD penalty with a BIC filter would be a preferable penalty function for application of SCCA to genomic data.
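
To make the four penalties named in the abstract concrete, the following is a minimal sketch of the scalar thresholding rule commonly associated with each one. It is an illustration under assumptions, not the paper's implementation: the abstract does not give the exact formulation used, so the function names, the parameters `lam`, `lam1`, `lam2`, and the SCAD constant `a = 3.7` (the conventional default in the SCAD literature) are all chosen here for illustration. In sparse CCA such a rule would typically be applied elementwise to a canonical weight vector during each update.

```python
import math


def soft_threshold(z, lam):
    """Lasso-type rule: shrink z toward zero by lam; small values become 0."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def hard_threshold(z, lam):
    """Hard-threshold rule: keep z unchanged if |z| > lam, otherwise 0."""
    return z if abs(z) > lam else 0.0


def elastic_net_threshold(z, lam1, lam2):
    """Elastic-net-type rule (naive form, assumed here): soft-threshold
    by lam1, then shrink proportionally by 1 / (1 + lam2)."""
    return soft_threshold(z, lam1) / (1.0 + lam2)


def scad_threshold(z, lam, a=3.7):
    """SCAD-type rule (requires a > 2): soft-thresholding near zero,
    a linear transition zone, and no shrinkage for large |z|."""
    if abs(z) <= 2.0 * lam:
        return soft_threshold(z, lam)
    if abs(z) <= a * lam:
        return ((a - 1.0) * z - math.copysign(a * lam, z)) / (a - 2.0)
    return z


if __name__ == "__main__":
    # Hypothetical weights, chosen only to show how the rules differ.
    for z in (-5.0, -1.5, 0.5, 3.0):
        print(z,
              soft_threshold(z, 1.0),
              hard_threshold(z, 1.0),
              elastic_net_threshold(z, 1.0, 1.0),
              scad_threshold(z, 1.0))
```

The contrast the sketch is meant to show: the Lasso-type rule shrinks every surviving weight by the same amount, the hard threshold does not shrink at all, and the SCAD-type rule behaves like the Lasso for small weights while leaving large weights unshrunk.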

Keywords: SCCA; Lasso; Elastic-net; SCAD; BIC; Penalty; SNP; mRNA expression
Date: 2012

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947311002660
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:56:y:2012:i:2:p:245-254

DOI: 10.1016/j.csda.2011.07.012

Computational Statistics & Data Analysis is currently edited by S.P. Azen

Bibliographic data for series maintained by Catherine Liu.

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:56:y:2012:i:2:p:245-254