PFA-Nipals: An Unsupervised Principal Feature Selection Based on Nonlinear Estimation by Iterative Partial Least Squares

Emilio Castillo-Ibarra, Marco A. Alsina, Cesar A. Astudillo and Ignacio Fuenzalida-Henríquez
Additional contact information
Emilio Castillo-Ibarra: Engineering Systems Doctoral Program, Faculty of Engineering, Universidad de Talca, Campus Curicó, Curicó 3340000, Chile
Marco A. Alsina: Faculty of Engineering, Architecture and Design, Universidad San Sebastian, Bellavista 7, Santiago 8420524, Chile
Cesar A. Astudillo: Department of Computer Science, Faculty of Engineering, University of Talca, Campus Curicó, Curicó 3340000, Chile
Ignacio Fuenzalida-Henríquez: Building Management and Engineering Department, Faculty of Engineering, University of Talca, Campus Curicó, Curicó 3340000, Chile

Mathematics, 2023, vol. 11, issue 19, 1-25

Abstract: Unsupervised feature selection (UFS) has received great interest in various areas of research that require dimensionality reduction, including machine learning, data mining, and statistical analysis. However, UFS algorithms are known to perform poorly on datasets with missing data, exhibiting a significant computational load and learning bias. In this work, we propose a novel and robust UFS method, designated PFA-Nipals, that works with missing data without the need for deletion or imputation. This is achieved by considering an iterative nonlinear estimation of principal components by partial least squares, while the relevant features are selected through minibatch K-means clustering. The proposed method is successfully applied to select the relevant features of a robust health dataset with missing data, outperforming other UFS methods in terms of computational load and learning bias. Furthermore, the proposed method is capable of finding a consistent set of relevant features without biasing the explained variability, even under increasing missing data. Finally, it is expected that the proposed method could be used in several areas, such as machine learning and big data with applications in different areas of the medical and engineering sciences.
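The abstract's central idea, estimating principal components iteratively while working around missing values rather than deleting or imputing them, can be illustrated with a minimal sketch. This is not the authors' implementation. It is a plain NIPALS-style loop written under the following assumptions: the function name `nipals_missing` and its parameters are hypothetical, missing cells are encoded as NaN, the columns are assumed to be centered already, and each regression step simply skips the missing cells. The clustering step that the abstract mentions for selecting features is not shown.

```python
import math


def nipals_missing(X, n_components=2, max_iter=500, tol=1e-10):
    """NIPALS-style principal components that skip NaN cells (no imputation).

    X is a list of rows (lists of floats); NaN marks a missing value.
    Returns (scores, loadings), each holding one list per component.
    Hypothetical helper for illustration only.
    """
    n, p = len(X), len(X[0])
    R = [row[:] for row in X]  # residual matrix; the input is left untouched
    obs = [[not math.isnan(v) for v in row] for row in R]
    scores, loadings = [], []
    for _ in range(n_components):
        # Start from the column with the largest observed sum of squares.
        start = max(
            range(p),
            key=lambda j: sum(R[i][j] ** 2 for i in range(n) if obs[i][j]),
        )
        t = [R[i][start] if obs[i][start] else 0.0 for i in range(n)]
        load = [0.0] * p
        for _ in range(max_iter):
            # Regress each column on the scores, using observed rows only.
            for j in range(p):
                num = sum(R[i][j] * t[i] for i in range(n) if obs[i][j])
                den = sum(t[i] ** 2 for i in range(n) if obs[i][j])
                load[j] = num / den if den > 0 else 0.0
            norm = math.sqrt(sum(v * v for v in load))
            if norm == 0.0:
                break  # nothing left to explain in the residual
            load = [v / norm for v in load]
            # Regress each row on the loadings, using observed columns only.
            t_new = []
            for i in range(n):
                num = sum(R[i][j] * load[j] for j in range(p) if obs[i][j])
                den = sum(load[j] ** 2 for j in range(p) if obs[i][j])
                t_new.append(num / den if den > 0 else 0.0)
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change < tol:
                break
        # Deflate: remove this component from the observed cells.
        for i in range(n):
            for j in range(p):
                if obs[i][j]:
                    R[i][j] -= t[i] * load[j]
        scores.append(t)
        loadings.append(load)
    return scores, loadings
```

Calling `nipals_missing(X, n_components=k)` on a centered data matrix returns `k` score vectors and `k` unit-length loading vectors. The product of a score and a loading gives the model's estimate for any cell, including cells that were missing in the input, which is why no separate imputation pass is needed in this sketch.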

Keywords: unsupervised feature selection; Nipals; clustering; missing data; interpretability
JEL-codes: C
Date: 2023

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/19/4154/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/19/4154/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:19:p:4154-:d:1252858

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:11:y:2023:i:19:p:4154-:d:1252858