Comparative Analysis of Manifold Learning-Based Dimension Reduction Methods: A Mathematical Perspective

Yi, Wenting; Bu, Siqi; Lee, Hiu-Hung; Chan, Chun-Hung

Wenting Yi, Siqi Bu, Hiu-Hung Lee and Chun-Hung Chan
Additional contact information
Wenting Yi: Centre for Advances in Reliability and Safety (CAiRS), Hong Kong SAR 999077, China
Siqi Bu: Centre for Advances in Reliability and Safety (CAiRS), Hong Kong SAR 999077, China
Hiu-Hung Lee: Centre for Advances in Reliability and Safety (CAiRS), Hong Kong SAR 999077, China
Chun-Hung Chan: Centre for Advances in Reliability and Safety (CAiRS), Hong Kong SAR 999077, China

Mathematics, 2024, vol. 12, issue 15, 1-21

Abstract: Manifold learning-based approaches have emerged as prominent techniques for dimensionality reduction. Among them, t-Distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) are two of the most widely used and effective. Although the two methods follow similar underlying procedures, empirical observations indicate that they differ in two properties: preservation of global data structure and computational efficiency. The mathematical principles behind these differences, however, remain unclear. To address this gap, this study compares the subprocesses of the two methods and, by examining their equation formulations, identifies the mathematical mechanisms that contribute to global data structure preservation and computational efficiency. To validate the theoretical analysis, data were collected through a laboratory experiment, and an open-source dataset was used to check the results across different datasets. The results obtained from both balanced and unbalanced datasets align consistently, confirming the study's findings. These insights provide a deeper understanding of the mathematical underpinnings of t-SNE and UMAP and support more informed and effective use of these dimensionality reduction techniques in applications such as anomaly detection, natural language processing, and bioinformatics.

Keywords: manifold learning; dimension reduction; spectral embedding; fuzzy topology; stochastic gradient descent
JEL-codes: C
Date: 2024

Downloads:
https://www.mdpi.com/2227-7390/12/15/2388/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/15/2388/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:15:p:2388-:d:1447130

Mathematics is currently edited by Ms. Emma He
