Japanese-Automobile Data
Shuichi Shinmura
Additional contact information
Shuichi Shinmura: Seikei University, Faculty of Economics
Chapter 7 in New Theory of Discriminant Analysis After R. Fisher, 2016, pp 139-161 from Springer
Abstract:
Japanese-automobile data consist of 29 regular and 15 small cars with six independent variables: the emission rate (X1), price (X2), number of seats (X3), CO2 (X4), fuel (X5), and sales (X6). The following points are important for this book:
(1) LSD discrimination: We can easily recognize that these data are LSD because either X1 or X3 alone separates the two classes completely, as their box-and-whisker plots show.
(2) Problem 3: The forward stepwise procedure selects X1, X2, X3, X4, X5, and X6 in this order. Although the MNM of Revised IP-OLDF and the NM of QDF are both zero in the one-variable model (X1), QDF misclassifies all regular cars as small cars after X3 enters the model, because X3 takes the constant value four for all small cars (Problem 3). These data are very suitable for explaining Problem 3 because they are simpler than the examination-score data, which use 100 items.
(3) Explanation of Method 2 by these data: When we discriminate six microarray datasets by eight LDFs, only Revised IP-OLDF naturally performs feature selection and reduces the high-dimensional gene space to a small gene subspace that is linearly separable. We call these subspaces "Matroskas." We establish the Matroska feature-selection method for microarray datasets (Method 2); these datasets consist of several disjoint small Matroskas with MNM = 0. Because LSD discrimination is not yet widely known and Method 2 involves several unfamiliar ideas, we explain these ideas using these data, in addition to the Swiss banknote data from Chap. 6 and the Student linearly separable data from Chap. 4. If the data are LSD, the full model is the largest Matroska and contains many smaller Matroskas. We already know that the smallest Matroska (the basic gene set or subspace, BGS) can describe the Matroska structure completely because MNM decreases monotonically. LASSO, on the other hand, also attempts feature selection. If it cannot find a BGS in the dataset, it cannot explain the dataset structure. Therefore, LASSO researchers had better examine their method on two common datasets before examining microarray datasets. If they are not successful with these ordinary data, it is not logical for them to expect a successful result in gene analysis. In particular, the Japanese-automobile data are simple data for feature selection because only two one-variable models are linearly separable, and both are BGSs.
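Two observations in the abstract can be illustrated with a minimal Python sketch: a single variable separates two classes when their value ranges do not overlap, and a variable that is constant within one class has zero within-class variance. Zero variance makes that class's covariance matrix singular, and QDF relies on each class's own covariance matrix. The seat counts and function names below are hypothetical and are not taken from the chapter's data; how a given QDF implementation handles the singular case is not shown here.

```python
from statistics import pvariance


def separable_by_one_variable(class_a, class_b):
    """Return True if the value ranges of the two classes do not overlap."""
    return max(class_a) < min(class_b) or max(class_b) < min(class_a)


def constant_within_class(values):
    """Return True if the variable has zero variance within the class."""
    return pvariance(values) == 0


# Hypothetical seat counts for illustration only (NOT the chapter's data).
seats_regular = [5, 5, 7, 8, 5, 6]
seats_small = [4, 4, 4, 4]

print(separable_by_one_variable(seats_regular, seats_small))
print(constant_within_class(seats_small))
print(constant_within_class(seats_regular))
```

With these made-up values, the number of seats separates the two classes on its own, yet it is constant within the small-car class, which is the situation the abstract associates with Problem 3.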
Keywords: Linearly separable data (LSD); Matroska Feature-selection method for microarray datasets (Method 2); Problem 3; LASSO; Fisher’s linear discriminant function (Fisher’s LDF); Support vector machine (SVM); Minimum number of misclassifications (minimum NM; MNM); Revised IP-OLDF; Revised IPLP-OLDF; Revised LP-OLDF
Date: 2016
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-981-10-2164-0_7
Ordering information: This item can be ordered from
http://www.springer.com/9789811021640
DOI: 10.1007/978-981-10-2164-0_7