Linear Dimensionality Reduction: What Is Better?

Mohit Baliyan and Evgeny M. Mirkes
Additional contact information
Mohit Baliyan: School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
Evgeny M. Mirkes: School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK

Data, 2025, vol. 10, issue 5, 1-21

Abstract: This research paper focuses on dimensionality reduction, which is a major subproblem in any data processing operation. Dimensionality reduction based on principal components is the most used methodology. Our paper examines three heuristics, namely Kaiser’s rule, the broken stick, and the conditional number rule, for selecting informative principal components when using principal component analysis to reduce high-dimensional data to lower dimensions. This study uses 22 classification datasets and three classifiers, namely Fisher’s discriminant classifier, logistic regression, and K nearest neighbors, to test the effectiveness of the three heuristics. The results show that there is no universal answer to the best intrinsic dimension, but the conditional number heuristic performs better, on average. This means that the conditional number heuristic is the best candidate for automatic data pre-processing.
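The abstract names the three heuristics without defining them. As a rough illustration only, the sketch below uses commonly cited textbook forms of each rule, applied to the eigenvalues of a covariance or correlation matrix. The function names are hypothetical, the exact variants used in the paper may differ, and the condition number threshold of 10 is an assumption rather than a value taken from the abstract.

```python
def kaiser_rule(eigenvalues):
    """Keep components whose eigenvalue exceeds the mean eigenvalue.

    For a correlation matrix the mean eigenvalue is 1, which gives the
    familiar "eigenvalue greater than one" form of the rule.
    """
    mean = sum(eigenvalues) / len(eigenvalues)
    return sum(1 for lam in eigenvalues if lam > mean)


def broken_stick(eigenvalues):
    """Keep leading components whose variance share beats the broken-stick share.

    The expected share of the i-th longest piece of a unit stick broken at
    random into p pieces is (1/p) * sum_{j=i..p} 1/j. Counting stops at the
    first component that does not exceed its expected share.
    """
    ordered = sorted(eigenvalues, reverse=True)
    p = len(ordered)
    total = sum(ordered)
    kept = 0
    for i, lam in enumerate(ordered, start=1):
        expected = sum(1.0 / j for j in range(i, p + 1)) / p
        if lam / total <= expected:
            break
        kept += 1
    return kept


def condition_number_rule(eigenvalues, max_condition=10.0):
    """Keep components with eigenvalue >= largest eigenvalue / max_condition.

    max_condition=10 is an assumed default, not a value from the abstract.
    """
    ordered = sorted(eigenvalues, reverse=True)
    largest = ordered[0]
    return sum(1 for lam in ordered if lam * max_condition >= largest)


# Made-up eigenvalues, used only to show that the rules can disagree.
example = [5.0, 2.0, 1.0, 0.6, 0.4]
print(kaiser_rule(example), broken_stick(example), condition_number_rule(example))
```

On this made-up spectrum the three rules retain 2, 1, and 4 components respectively. This kind of disagreement is what makes an empirical comparison across datasets and classifiers necessary.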

Keywords: principal components; dimensionality reduction
JEL-codes: C8 C80 C81 C82 C83
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2306-5729/10/5/70/pdf (application/pdf)
https://www.mdpi.com/2306-5729/10/5/70/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jdataj:v:10:y:2025:i:5:p:70-:d:1650326

Data is currently edited by Ms. Cecilia Yang

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-05-07
Handle: RePEc:gam:jdataj:v:10:y:2025:i:5:p:70-:d:1650326