A Rough Based Hybrid Binary PSO Algorithm for Flat Feature Selection and Classification in Gene Expression Data
Suresh Dara (),
Haider Banka () and
Chandra Sekhara Rao Annavarapu ()
Additional contact information
Suresh Dara: B.V. Raju Inistitute of Technology
Haider Banka: Indian Institute of Technology (ISM)
Chandra Sekhara Rao Annavarapu: Indian Institute of Technology (ISM)
Annals of Data Science, 2017, vol. 4, issue 3, No 3, 360 pages
Abstract:
Abstract Feature selection in high dimensional data, particularly, in gene expression data, is one of the challenging task in bioinformatics due to the curse of dimensionality, data redundancy and noise values. In gene expression data, insignificant features causes poor classification, hence feature selection reduces feature subset, improving classification accuracy. Feature selection algorithms in gene expression data(such as filter based, wrapper based and hybrid methods) performing poor accuracy, where as few methods takes too much time to converge for an acceptable results. For example, in NSGA-II, over 10,000 generations, on an average, to converge in the search space. where it incurs increased computational time. Proposed rough based hybrid binary PSO algorithm, which uses a heuristic based fast processing strategy to reduce crude domain features by statistical elimination of redundant features and then discretized subsequently into a binary table, known as distinction table, in rough set theory. This distinction table is later used as input to evaluate and optimize the objectives functions i.e., to generate reduct in rough set theory. The proposed hybrid binary PSO is then used to tune the objective functions, to choose the most important features (i:e:reduct). The fitness function is used in such a way that it can reduce the cardinality of the features and at the same time, improve the classification performance as well. Results have been demonstrated to show the effectiveness of the proposed method, on existing three benchmark datasets (i.e. colon cancer, lymphoma and leukemia data), from literature.
Keywords: Classifications; Binary PSO; Microarray gene expression; Feature selection; Rough set theory (search for similar items in EconPapers)
Date: 2017
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s40745-017-0106-3 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:aodasc:v:4:y:2017:i:3:d:10.1007_s40745-017-0106-3
Ordering information: This journal article can be ordered from
https://www.springer ... gement/journal/40745
DOI: 10.1007/s40745-017-0106-3
Access Statistics for this article
Annals of Data Science is currently edited by Yong Shi
More articles in Annals of Data Science from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().