Evolutionary methods for unsupervised feature selection using Sammon’s stress function
Amit Saxena, Nikhil R. Pal and Megha Vora
Additional contact information
Amit Saxena: Guru Ghasidas University
Nikhil R. Pal: Indian Statistical Institute
Megha Vora: Indian Institute of Technology-Madras
Fuzzy Information and Engineering, 2010, vol. 2, issue 3, 229-247
Abstract:
In this paper, four methods are proposed for unsupervised feature selection using genetic algorithms. The proposed methods do not use class label information; instead, they select a set of features using a task-independent criterion that preserves the geometric structure (topology) of the original data in the reduced feature space. One component of the fitness function is Sammon’s stress function, which tries to preserve the topology of high-dimensional data when it is mapped to a lower-dimensional space. In this context, in addition to a fitness criterion, we also explore the utility of an unfitness criterion for selecting chromosomes for genetic operations. This ensures higher diversity in the population and helps unfit chromosomes become fitter. We evaluate the quality of the selected features in four different ways: Sammon error, correlation between the inter-point distances in the two spaces, a measure of preservation of the cluster structure found in the original and reduced spaces, and classifier performance. The proposed methods are tested on six real data sets with dimensionality varying between 9 and 60. The selected features are found to be excellent in terms of preservation of topology (inter-point geometry), cluster structure and classifier performance. We do not compare our methods with other methods because, unlike other methods, we check the quality of the selected features in four different ways by finding how well they preserve the “structure” of the original data.
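As a rough illustration of two of the quantities named in the abstract, the sketch below computes the standard Sammon stress and the correlation of inter-point distances for a candidate feature subset. It is not the authors' implementation. The function names, the boolean `mask` encoding of a chromosome, the use of Euclidean distance and the choice of Pearson correlation are all assumptions made for this example. The abstract says only that Sammon's stress is one component of the fitness function, and the other components are not reproduced here.

```python
import numpy as np

def pairwise_distances(X):
    """Return the Euclidean distances d_ij for all pairs i < j as a flat vector."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(X.shape[0], k=1)
    return dist[upper]

def sammon_stress(X, mask, eps=1e-12):
    """Standard Sammon stress between the full data and the selected features.

    mask is a boolean vector over the columns of X (a hypothetical
    chromosome encoding; the paper's encoding is not given in the abstract).
    """
    d_orig = pairwise_distances(X)
    d_red = pairwise_distances(X[:, mask])
    keep = d_orig > eps  # skip coincident points to avoid division by zero
    d_orig, d_red = d_orig[keep], d_red[keep]
    return ((d_orig - d_red) ** 2 / d_orig).sum() / d_orig.sum()

def interpoint_distance_correlation(X, mask):
    """Pearson correlation between inter-point distances in the two spaces
    (the type of correlation is an assumption)."""
    return np.corrcoef(pairwise_distances(X), pairwise_distances(X[:, mask]))[0, 1]

# Usage on synthetic data: score a subset that keeps features 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
mask = np.array([True, False, True, False, False])
stress = sammon_stress(X, mask)
corr = interpoint_distance_correlation(X, mask)
```

A lower stress and a higher correlation both indicate that the selected features preserve the inter-point geometry of the original data more closely. Selecting every feature gives a stress of zero.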
Keywords: Dimensionality reduction; Feature analysis; Genetic algorithm; Classification techniques
Date: 2010
http://link.springer.com/10.1007/s12543-010-0047-4 Abstract (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:spr:fuzinf:v:2:y:2010:i:3:d:10.1007_s12543-010-0047-4
Ordering information: This journal article can be ordered from
https://www.springer.com/journal/12543
DOI: 10.1007/s12543-010-0047-4