EconPapers    
Economics at your fingertips  
 

A high-dimensional two-sample test for the mean using random subspaces

Måns Thulin

Computational Statistics & Data Analysis, 2014, vol. 74, issue C, 26-38

Abstract: A common problem in genetics is that of testing whether a set of highly dependent gene expressions differ between two populations, typically in a high-dimensional setting where the data dimension is larger than the sample size. Most high-dimensional tests for the equality of two mean vectors rely on naive diagonal or trace estimators of the covariance matrix, ignoring dependences between variables. A test using random subspaces is proposed, which offers higher power when the variables are dependent and is invariant under linear transformations of the marginal distributions. The p-values for the test are obtained using permutations. The test does not rely on assumptions about normality or the structure of the covariance matrix. It is shown by simulation that the new test has higher power than competing tests in realistic settings motivated by microarray gene expression data. Computational aspects of high-dimensional permutation tests are also discussed and an efficient R implementation of the proposed test is provided.

Keywords: Gene expression data; Gene-set testing; High-dimensional data; Random subspace; Test about the mean; Two-sample problem (search for similar items in EconPapers)
Date: 2014
References: Add references at CitEc
Citations: View citations in EconPapers (15)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947313004726
Full text for ScienceDirect subscribers only.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:74:y:2014:i:c:p:26-38

DOI: 10.1016/j.csda.2013.12.003

Access Statistics for this article

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu (repec@elsevier.com).

 
Page updated 2024-12-28
Handle: RePEc:eee:csdana:v:74:y:2014:i:c:p:26-38