The Analysis of Multivariate Data Using Semi-Definite Programming

A.H. Al-Ibrahim

Journal of Classification, 2015, vol. 32, issue 3, 382-413

Abstract: A model is presented for analyzing general multivariate data. The model's prime objective is dimensionality reduction of the multivariate problem. The only requirement of the model is that the input data to the statistical analysis be a covariance matrix, a correlation matrix, or, more generally, a positive semi-definite matrix. The model is parameterized by a scale parameter and a shape parameter, both of which take non-negative values smaller than unity. We first prove a well-known heuristic for minimizing rank and establish the conditions under which rank can be replaced with trace. This result allows us to solve our rank minimization problem as a Semi-Definite Programming (SDP) problem using a number of available solvers. We then apply the model to four case studies dealing with four well-known problems in multivariate analysis. The first problem is to determine the number of underlying factors in factor analysis (FA) or the number of retained components in principal component analysis (PCA). It is shown that our model determines the number of factors or components more efficiently than the commonly used methods. The second example concerns sparse principal components and variable selection in PCA, a problem that has received much attention in recent years due to its wide applications. When applied to a data set known in the literature as the pitprop data, our approach yields PCs with larger variances than PCs derived from other approaches. The third problem concerns sensitivity analysis of multivariate models, a topic not widely researched due to its difficulty. Finally, we apply the model to a difficult problem in PCA known as lack of scale invariance in the solutions of PCA: the solutions derived from analyzing the covariance matrix in PCA are generally different from (and not linearly related to) the solutions derived from analyzing the correlation matrix. Using our model, we obtain the same solution whether we analyze the correlation matrix or the covariance matrix, since the analysis uses only the signs of the correlations/covariances, not their values. Here we introduce a new type of PCA, called Sign PCA, and speculate on its applications in the social sciences and other fields of science. Copyright Classification Society of North America 2015
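The central computational idea in the abstract, replacing rank with trace so that the rank-minimization problem becomes a convex SDP, can be illustrated with a minimal sketch. The snippet below is not the paper's model: it is a generic trace-heuristic example with hypothetical random measurement matrices, using CVXPY as an assumed solver interface, to show how minimizing the trace of a PSD variable subject to linear constraints acts as a convex surrogate for minimizing rank. In the paper's setting the PSD input is a covariance or correlation matrix and the constraints are governed by the scale and shape parameters; those details are omitted here.

    # Minimal sketch (not the paper's model): trace heuristic for rank
    # minimization over PSD matrices, solved as an SDP with CVXPY.
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, m = 6, 8
    L = rng.standard_normal((n, 2))
    X_true = L @ L.T                                   # a rank-2 PSD target matrix
    A_ops = [(A + A.T) / 2 for A in (rng.standard_normal((n, n)) for _ in range(m))]
    b = np.array([np.trace(A @ X_true) for A in A_ops])  # linear measurements of X_true

    X = cp.Variable((n, n), PSD=True)                  # decision variable constrained to be PSD
    constraints = [cp.trace(A @ X) == b[i] for i, A in enumerate(A_ops)]
    # Trace stands in for rank: it is convex, and over PSD matrices it equals
    # the sum of eigenvalues, which encourages (but does not guarantee) low rank.
    prob = cp.Problem(cp.Minimize(cp.trace(X)), constraints)
    prob.solve()

    eigvals = np.linalg.eigvalsh(X.value)
    print("estimated rank:", int((eigvals > 1e-6 * eigvals.max()).sum()))

The conditions under which this trace surrogate actually recovers the minimum-rank solution are exactly the kind of question the paper addresses before casting its model as an SDP.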

Keywords: FA; PCA; SDP; Multivariate statistical analysis; Positive semi-definite; Kernel methods; Kernel trick; Sign PCA
Date: 2015

Downloads: http://hdl.handle.net/10.1007/s00357-015-9184-0 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:spr:jclass:v:32:y:2015:i:3:p:382-413

Ordering information: This journal article can be ordered from
http://www.springer. ... hods/journal/357/PS2

DOI: 10.1007/s00357-015-9184-0

Journal of Classification is currently edited by Douglas Steinley

More articles in Journal of Classification from Springer, The Classification Society
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

Handle: RePEc:spr:jclass:v:32:y:2015:i:3:p:382-413