Ultra-high dimensional variable screening via Gram–Schmidt orthogonalization
Huiwen Wang,
Ruiping Liu,
Shanshan Wang,
Zhichao Wang and
Gilbert Saporta
Additional contact information
Huiwen Wang: Beihang University
Ruiping Liu: Beihang University
Shanshan Wang: Beihang University
Zhichao Wang: Beihang University
Gilbert Saporta: Conservatoire National des Arts et Métiers
Computational Statistics, 2020, vol. 35, issue 3, No 10, 1153-1170
Abstract:
Independence screening procedures play a vital role in variable selection when the number of variables is massive. However, high dimensionality brings challenges such as multicollinearity or high (possibly spurious) correlation between the covariates, which makes marginal correlation an unreliable measure of association between the covariates and the response. We propose a novel and simple screening procedure called Gram–Schmidt screening (GSS), which integrates classical Gram–Schmidt orthogonalization with the sure independence screening technique and takes into account high correlations between the covariates in a data-driven way. GSS can successfully discriminate between relevant and irrelevant variables, achieving a high true positive rate without including many irrelevant and redundant variables, and offers a new perspective on screening methods when the covariates are highly correlated. The practical performance of GSS is shown by comparative simulation studies and the analysis of two real datasets.
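The contrast the abstract draws between marginal-correlation screening and an orthogonalization-based alternative can be illustrated with a small sketch. This is not the authors' GSS algorithm, whose steps are not given in this record. It is a hypothetical greedy variant, assumed here for illustration only: pick the covariate most correlated with the response, remove its direction from the remaining covariates with one Gram–Schmidt step, and repeat. The function names (`marginal_screen`, `gram_schmidt_screen`) and the toy data are invented for this example, and only the Python standard library is used so that the block is self-contained.

```python
import math

def center(v):
    """Subtract the mean so that inner products behave like covariances."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def abs_corr(u, v, tol=1e-12):
    """Absolute correlation of two centered vectors; 0 if either is (near) zero."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu < tol or nv < tol:
        return 0.0
    return abs(dot(u, v)) / (nu * nv)

def marginal_screen(columns, y, k):
    """Plain marginal screening: keep the k covariates with largest |corr| with y."""
    yc = center(y)
    scores = [abs_corr(center(c), yc) for c in columns]
    return sorted(range(len(columns)), key=lambda j: -scores[j])[:k]

def gram_schmidt_screen(columns, y, k):
    """Hypothetical greedy sketch (an assumption, not the published GSS):
    after each pick, orthogonalize the remaining covariates against it."""
    yc = center(y)
    resid = {j: center(c) for j, c in enumerate(columns)}
    selected = []
    while resid and len(selected) < k:
        best = max(resid, key=lambda j: abs_corr(resid[j], yc))
        q = resid.pop(best)
        selected.append(best)
        qq = dot(q, q)
        if qq > 1e-12:
            # One Gram-Schmidt step: remove the component along q.
            resid = {
                j: [ri - (dot(r, q) / qq) * qi for ri, qi in zip(r, q)]
                for j, r in resid.items()
            }
    return selected

# Toy data built from mutually orthogonal, mean-zero vectors (illustrative only).
a = [1, 1, 1, 1, -1, -1, -1, -1]
b = [1, 1, -1, -1, 1, 1, -1, -1]
c = [1, -1, 1, -1, 1, -1, 1, -1]
d = [1, -1, -1, 1, 1, -1, -1, 1]

X = [
    a,                                        # x0: relevant
    b,                                        # x1: relevant, weaker effect
    [ai + 0.3 * ci for ai, ci in zip(a, c)],  # x2: redundant, highly correlated with x0
    d,                                        # x3: irrelevant
]
y = [2 * ai + bi for ai, bi in zip(a, b)]     # response depends on x0 and x1 only

print(marginal_screen(X, y, 2))      # [0, 2]: the redundant x2 crowds out x1
print(gram_schmidt_screen(X, y, 2))  # [0, 1]: x2 carries no signal once x0 is removed
```

In this toy setting the redundant covariate x2 ranks second under marginal correlation only because it nearly duplicates x0. Once the x0 direction is projected out, its association with the response vanishes and the weaker but genuinely relevant x1 is selected instead.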
Keywords: Variable selection; High correlation; High dimensionality; Screening procedure
Date: 2020
Downloads: (external link)
http://link.springer.com/10.1007/s00180-020-00963-7 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:35:y:2020:i:3:d:10.1007_s00180-020-00963-7
Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2
DOI: 10.1007/s00180-020-00963-7
Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.