Natural data structure extracted from neighborhood-similarity graphs
Tom Lorimer, Karlis Kanders and Ruedi Stoop
Chaos, Solitons & Fractals, 2019, vol. 119, issue C, 326-331
Abstract:
‘Big’ high-dimensional data are commonly analyzed in low dimensions, after performing a dimensionality reduction step that inherently distorts the data structure. For a similar analysis, clustering methods are also often used. These methods introduce a bias as well, either by starting from the assumption of a particular, often geometric, property of the clusters, or by using iterative schemes to enhance cluster contours, with consequences that are hard to control. The goal of data analysis should, however, be to encode and detect structural data features at all scales and densities simultaneously, without assuming a parametric form of data point distances, or modifying them. Here, we propose a novel approach that directly encodes data point neighborhood similarities as a sparse graph. Our non-iterative framework permits a transparent interpretation of data, without altering the original data dimension and metric. Several natural and synthetic data applications demonstrate the efficacy of our novel method.
Keywords: Data complexity; Data networks; Big data; Clustering
Date: 2019
Downloads:
http://www.sciencedirect.com/science/article/pii/S0960077919300104
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:chsofr:v:119:y:2019:i:c:p:326-331
DOI: 10.1016/j.chaos.2018.12.033
Chaos, Solitons & Fractals is currently edited by Stefano Boccaletti and Stelios Bekiros
Bibliographic data for series maintained by Thayer, Thomas R.