Grouped variable screening for ultra-high dimensional data for linear model
Debin Qiu and Jeongyoun Ahn
Computational Statistics & Data Analysis, 2020, vol. 144, issue C
Abstract:
Ultra-high dimensional data sets often need a screening step that removes irrelevant variables prior to the main analysis. In high-dimensional linear regression, screening relevant predictors before model estimation often yields better prediction accuracy and much faster computation. However, most existing screening approaches target individual predictors and thus cannot incorporate structured predictors, such as dummy variables and grouped variables. New screening methods for naturally grouped predictors in high-dimensional linear regression are presented. Two popular variable screening methods are generalized to the grouped-predictor case, and a novel screening procedure is proposed. Asymptotic sure screening properties are established for all three methods. Empirical benefits of the screening approaches are demonstrated via simulation and a real data analysis. Specifically, a two-step analysis that performs screening followed by sparse estimation improves both prediction accuracy and computing time compared to one-stage sparse regression.
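To make the idea of screening whole groups rather than individual predictors concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' procedure: the score used (the norm of a group's marginal correlations with the response, scaled by the square root of the group size) is one plausible group analogue of SIS, and the function name, the parameter `n_keep`, and the simulated data are all hypothetical.

```python
import numpy as np

def grouped_marginal_screening(X, y, groups, n_keep):
    """Rank predictor groups by a marginal score and keep the top n_keep.

    Assumption: the score ||X_g^T y|| / (n * sqrt(p_g)) is an illustrative
    group analogue of SIS, not necessarily the statistic used in the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n = X.shape[0]

    # Standardize columns and center the response so scores are comparable.
    Xs = X - X.mean(axis=0)
    sd = Xs.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant columns
    Xs = Xs / sd
    yc = y - y.mean()

    labels = np.unique(groups)
    scores = np.empty(len(labels))
    for i, g in enumerate(labels):
        cols = np.flatnonzero(groups == g)
        marginal = Xs[:, cols].T @ yc / n  # per-column marginal covariance
        scores[i] = np.linalg.norm(marginal) / np.sqrt(len(cols))

    order = np.argsort(scores)[::-1]  # largest score first
    kept = labels[order[:n_keep]]
    return kept, dict(zip(labels.tolist(), scores.tolist()))

# Hypothetical simulated data: 50 groups of 4 predictors, signal in groups 0 and 1.
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(50), 4)
X = rng.standard_normal((200, 200))
beta = np.zeros(200)
beta[:8] = 3.0
y = X @ beta + rng.standard_normal(200)

kept, scores = grouped_marginal_screening(X, y, groups, n_keep=5)
print("groups retained:", sorted(kept.tolist()))
```

In a two-step analysis of the kind the abstract describes, a sparse estimator would then be fitted only on the columns belonging to the retained groups. A purely marginal score like this one can be misled by strongly correlated predictors, so it should be read as a baseline sketch only.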
Keywords: Grouped variable screening; HOLP; Multicollinearity; SIS; Sparse regression; Sure screening property
Date: 2020
Downloads:
http://www.sciencedirect.com/science/article/pii/S016794731930249X
Full text for ScienceDirect subscribers only.
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:144:y:2020:i:c:s016794731930249x
DOI: 10.1016/j.csda.2019.106894
Computational Statistics & Data Analysis is currently edited by S.P. Azen
Bibliographic data for series maintained by Catherine Liu.