Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
Korey P. Wylie and
Jason R. Tregellas
Additional contact information
Korey P. Wylie: Department of Psychiatry, University of Colorado School of Medicine, Anschutz Medical Campus, Anschutz Health Sciences Building, 1890 N Revere Ct, Aurora, CO 80045, USA
Jason R. Tregellas: Department of Psychiatry, University of Colorado School of Medicine, Anschutz Medical Campus, Anschutz Health Sciences Building, 1890 N Revere Ct, Aurora, CO 80045, USA
Mathematics, 2024, vol. 13, issue 1, 1-23
Abstract:
Hierarchical clustering analysis (HCA) is a widely used unsupervised learning method. Limitations of HCA, however, include imposing an artificial hierarchy onto non-hierarchical data and fixed two-way mergers at every level. To address this, the current work describes a novel rootlets hierarchical principal component analysis (hPCA). This method extends typical hPCA using multivariate statistics to construct adaptive multiway mergers and Riemannian geometry to visualize nested dependencies. The rootlets hPCA algorithm and its projection onto the Poincaré disk are presented as examples of this extended framework. The algorithm constructs high-dimensional mergers using a single parameter, interpreted as a p-value. It decomposes a similarity matrix from GL(m, ℝ) using a sequence of rotations from SO(k), k ≪ m. Analysis shows that the rootlets algorithm limits the number of distinct eigenvalues for any merger. Nested clusters of arbitrary size but equal correlations are constructed and merged using their leading principal components. The visualization method then maps elements of SO(k) onto a low-dimensional hyperbolic manifold, the Poincaré disk. Rootlets hPCA was validated using simulated datasets with known hierarchical structure, and a neuroimaging dataset with an unknown hierarchy. Experiments demonstrate that rootlets hPCA accurately reconstructs known hierarchies and, unlike HCA, does not impose a hierarchy on data.
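To make the core merging idea concrete, the sketch below illustrates one step the abstract describes: replacing a cluster of correlated variables with its leading principal component before further merging. This is a minimal illustration in NumPy, not the authors' rootlets hPCA implementation; the helper names `leading_pc` and `merge_clusters` are hypothetical, and the full algorithm additionally uses a p-value threshold and SO(k) rotations not shown here.

```python
import numpy as np

def leading_pc(X):
    """First principal component scores of a data matrix.

    X: (n_samples, n_vars) array whose columns form one cluster.
    Uses SVD of the centered data; Vt[0] is the leading PC loading.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]

def merge_clusters(X, clusters):
    """Summarize each cluster of variables by its leading PC scores.

    clusters: list of column-index lists.
    Returns an (n_samples, n_clusters) array of merged variables.
    """
    return np.column_stack([leading_pc(X[:, idx]) for idx in clusters])

rng = np.random.default_rng(0)
# Simulated data: one correlated two-variable block plus two noise columns
base = rng.standard_normal((100, 1))
X = np.hstack([base + 0.1 * rng.standard_normal((100, 2)),
               rng.standard_normal((100, 2))])
merged = merge_clusters(X, [[0, 1], [2, 3]])
print(merged.shape)  # (100, 2)
```

In a hierarchical run, the merged columns would themselves be candidates for the next merger, so each level of the tree summarizes its children by their shared leading component.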
Keywords: eigendecomposition; multivariate statistics; hyperbolic manifold; Riemannian geometry; manifold learning (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2024
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
https://www.mdpi.com/2227-7390/13/1/72/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/1/72/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2024:i:1:p:72-:d:1555350
Access Statistics for this article
Mathematics is currently edited by Ms. Emma He
More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.