The k-NN algorithm for compositional data: a revised approach with and without zero values present
Michail Tsagris
MPRA Paper from University Library of Munich, Germany
Abstract:
In compositional data, an observation is a vector with non-negative components which sum to a constant, typically 1. Data of this type arise in many areas, such as geology, archaeology, biology, economics and political science. The goal of this paper is to extend the taxicab metric and a newly suggested metric for compositional data by employing a power transformation. Both metrics are to be used in the k-nearest neighbours algorithm regardless of the presence of zeros. Examples with real data are exhibited.
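As a rough illustration of the idea in the abstract, the sketch below applies a power transformation to each composition, measures the taxicab (L1) distance between the transformed vectors, and classifies by majority vote among the k nearest neighbours. This listing does not state the paper's formulas, so the exact form used here (raise each component to a power alpha > 0, then rescale to sum to 1) and the names `power_transform`, `taxicab_alpha` and `knn_predict` are assumptions made for illustration, not the author's definitions. The second, entropy-based metric is not shown.

```python
from collections import Counter


def power_transform(x, alpha):
    """Raise each component to the power alpha and rescale to sum to 1.

    Assumed form of the power transformation; with alpha > 0 a zero
    component stays zero, so no logarithm of zero is ever taken.
    """
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    return [p / total for p in powered]


def taxicab_alpha(x, y, alpha):
    """Taxicab (L1) distance between two power-transformed compositions."""
    u = power_transform(x, alpha)
    v = power_transform(y, alpha)
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_predict(train_x, train_y, query, k=3, alpha=0.5):
    """Predict the label of `query` by majority vote among its k nearest
    training compositions under the power-transformed taxicab distance."""
    scored = [(taxicab_alpha(x, query, alpha), label)
              for x, label in zip(train_x, train_y)]
    scored.sort(key=lambda pair: pair[0])  # nearest first
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three-part compositions, some containing zeros.
train_x = [[0.7, 0.3, 0.0], [0.6, 0.4, 0.0], [0.1, 0.2, 0.7], [0.0, 0.3, 0.7]]
train_y = ["a", "a", "b", "b"]
prediction = knn_predict(train_x, train_y, [0.65, 0.35, 0.0], k=3, alpha=0.5)
```

In practice both alpha and k would be tuning parameters, typically chosen by cross-validation; the values above are arbitrary.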
Keywords: compositional data; entropy; k-NN algorithm; metric; supervised classification
JEL-codes: C18
Date: 2014-07
New Economics Papers: this item is included in nep-ecm
Published in Journal of Data Science 12.3(2014): pp. 519-534
Downloads:
https://mpra.ub.uni-muenchen.de/65866/1/MPRA_paper_65866.pdf (original version, PDF)
Persistent link: https://EconPapers.repec.org/RePEc:pra:mprapa:65866