PBoostGA: pseudo-boosting genetic algorithm for variable ranking and selection
Chun-Xia Zhang (),
Jiang-She Zhang and
Sang-Woon Kim
Additional contact information
Chun-Xia Zhang: Xi’an Jiaotong University
Jiang-She Zhang: Xi’an Jiaotong University
Sang-Woon Kim: Myongji University
Computational Statistics, 2016, vol. 31, issue 4, No 1, 1237-1262
Abstract:
Abstract Variable selection has consistently been a hot topic in linear regression models, especially when facing with high-dimensional data. Variable ranking, an advanced form of selection, is actually more fundamental since selection can be realized by thresholding once the variables are ranked suitably. In recent years, ensemble learning has gained a significant interest in the context of variable selection due to its great potential to improve selection accuracy and to reduce the risk of falsely including some unimportant variables. Motivated by the widespread success of boosting algorithms, a novel ensemble method PBoostGA is developed in this paper to implement variable ranking and selection in linear regression models. In PBoostGA, a weight distribution is maintained over the training set and genetic algorithm is adopted as its base learner. Initially, equal weight is assigned to each instance. According to the weight updating and ensemble member generating mechanism like AdaBoost.RT, a series of slightly different importance measures are sequentially produced for each variable. Finally, the candidate variables are ordered in the light of the average importance measure and some significant variables are then selected by a thresholding rule. Both simulation results and a real data illustration show the effectiveness of PBoostGA in comparison with some existing counterparts. In particular, PBoostGA has stronger ability to exclude redundant variables.
Keywords: Variable selection; Variable ranking; Genetic algorithm; Ensemble learning; Variable selection ensemble; Boosting (search for similar items in EconPapers)
JEL-codes: C15 (search for similar items in EconPapers)
Date: 2016
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s00180-016-0652-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:31:y:2016:i:4:d:10.1007_s00180-016-0652-8
Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2
DOI: 10.1007/s00180-016-0652-8
Access Statistics for this article
Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik
More articles in Computational Statistics from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().