Data-Driven Identification of Key Variables

Yuan, Bo; Klir, George

Additional contact information
Bo Yuan: New Mexico Highlands University, NASA Center for Autonomous Control Engineering, Dept of Engineering
George Klir: Binghamton University-SUNY, Center for Intelligent Systems and Dept of Systems Science & Industrial Engineering

Chapter 7 in Intelligent Hybrid Systems, 1997, pp 161-187 from Springer

Abstract: In this chapter, we investigate the following problem: given a data set involving n variables, determine the key variables that contribute most to a specific partition of this data set. This problem has broad applicability, even though it emerged in the context of a particular engineering application, the process of manufacturing electric circuit boards. Two distinct approaches are used for dealing with the problem, each resulting in a particular algorithm. Both algorithms employ evolutionary computation. The first approach is based on the well-known fuzzy c-means algorithm. The principal idea is to use the full class of Mahalanobis distances, each of which weights the variables involved in a particular way. Using this class of distances, we search by an evolutionary algorithm for the optimal distance, that is, the one under which the fuzzy c-means algorithm produces a fuzzy partition of the given data set that is as close as possible to the given crisp partition. The contribution of each variable to this partition is then inferred from the parameter values of the optimal Mahalanobis distance. The second approach is based on fuzzy measures. The principal idea is to consider each data vector as an evaluation function of an object with respect to several features, represented by the variables involved. This allows us to aggregate the values of the variables at each data vector by the fuzzy integral with respect to a particular fuzzy measure that specifies the significance of the various subsets of variables. The fuzzy c-means algorithm is then applied to the aggregated values under different fuzzy measures. An evolutionary algorithm is used to search for the optimal fuzzy measure, that is, the one under which the fuzzy c-means algorithm produces a fuzzy partition that is as close as possible to the given crisp partition. The contribution of each subset of variables to this partition is then inferred from the optimal fuzzy measure.
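The first approach described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation and rests on several assumptions: the distance matrix is written as A = L L^T with a lower-triangular factor L (suggested by the "Cholesky Factor" keyword), the closeness between the fuzzy and crisp partitions is taken to be a mean squared difference minimized over cluster relabelings, and a simple (1+1) evolution strategy stands in for the unspecified evolutionary algorithm. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random
from itertools import permutations

def mahalanobis_sq(x, v, L):
    """Squared distance (x-v)^T A (x-v) with A = L L^T, L lower-triangular."""
    d = [a - b for a, b in zip(x, v)]
    n = len(d)
    # y = L^T d, so d^T A d = ||y||^2, which is non-negative for any L.
    y = [sum(L[i][j] * d[i] for i in range(j, n)) for j in range(n)]
    return sum(t * t for t in y)

def memberships(data, centers, L, m):
    """Standard fuzzy c-means membership update under the given distance."""
    U = []
    for x in data:
        inv = [max(mahalanobis_sq(x, v, L), 1e-12) ** (-1.0 / (m - 1.0))
               for v in centers]
        s = sum(inv)
        U.append([w / s for w in inv])
    return U

def fuzzy_c_means(data, c, L, m=2.0, iters=50):
    """Return the fuzzy partition (one membership row per data vector)."""
    n_pts, dim = len(data), len(data[0])
    # Assumed deterministic start: c points spread evenly through the list (c >= 2).
    centers = [list(data[(k * (n_pts - 1)) // (c - 1)]) for k in range(c)]
    for _ in range(iters):
        U = memberships(data, centers, L, m)
        for k in range(c):
            w = [u[k] ** m for u in U]
            tot = sum(w)
            centers[k] = [sum(wi * x[j] for wi, x in zip(w, data)) / tot
                          for j in range(dim)]
    return memberships(data, centers, L, m)

def partition_mismatch(U, labels, c):
    """Assumed closeness measure: mean squared gap to the crisp (one-hot)
    partition, minimized over relabelings of the clusters."""
    best = float("inf")
    for perm in permutations(range(c)):
        err = sum((u[perm[k]] - (1.0 if labels[i] == k else 0.0)) ** 2
                  for i, u in enumerate(U) for k in range(c))
        best = min(best, err / len(U))
    return best

def evolve_distance(data, labels, c, generations=40, sigma=0.5, seed=0):
    """(1+1) evolution strategy over L; a stand-in for the chapter's algorithm."""
    rng = random.Random(seed)
    dim = len(data[0])
    fitness = lambda L: partition_mismatch(fuzzy_c_means(data, c, L), labels, c)
    best = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    best_fit = fitness(best)  # identity factor corresponds to Euclidean distance
    for _ in range(generations):
        child = [row[:] for row in best]
        for i in range(dim):
            for j in range(i):
                child[i][j] += rng.gauss(0.0, sigma)        # off-diagonal entries
            child[i][i] *= math.exp(rng.gauss(0.0, sigma))  # diagonal stays > 0
        fit = fitness(child)
        if fit <= best_fit:  # elitist: never accept a worse distance
            best, best_fit = child, fit
    return best, best_fit

# Toy data (made up): variable 0 separates the two classes, variable 1 does not.
data = [(0.0, -5.0), (0.1, 5.0), (0.2, -5.0), (0.3, 5.0),
        (1.0, -5.0), (1.1, 5.0), (1.2, -5.0), (1.3, 5.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
favor_x0 = [[1.0, 0.0], [0.0, 0.01]]  # distance dominated by variable 0
favor_x1 = [[0.01, 0.0], [0.0, 1.0]]  # distance dominated by variable 1
for name, L in (("favor x0", favor_x0), ("favor x1", favor_x1)):
    print(name, partition_mismatch(fuzzy_c_means(data, 2, L), labels, 2))
```

On this toy data, the factor that emphasizes variable 0 reproduces the crisp partition far more closely than the one that emphasizes variable 1, which is the kind of signal the search exploits. Calling `evolve_distance(data, labels, 2)` returns a factor and its mismatch. How the chapter reads variable contributions off the optimal matrix is not stated in the abstract, so that step is left out here.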

Keywords: Evolutionary Algorithm; Mahalanobis Distance; Positive Definite Matrix; Cholesky Factor; Fuzzy Measure
Date: 1997

Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-1-4615-6191-0_7

Ordering information: This item can be ordered from
http://www.springer.com/9781461561910

DOI: 10.1007/978-1-4615-6191-0_7
