Improved classification for compositional data using the $\alpha$-transformation
Michail Tsagris,
Simon Preston and
Andrew T.A. Wood
MPRA Paper from University Library of Munich, Germany
Abstract:
In compositional data analysis an observation is a vector containing non-negative values, only the relative sizes of which are considered to be of interest. Without loss of generality, a compositional vector can be taken to be a vector of proportions that sum to one. Data of this type arise in many areas including geology, archaeology, biology, economics and political science. In this paper we investigate methods for classification of compositional data. Our approach centres on the idea of using the $\alpha$-transformation to transform the data and then to classify the transformed data via regularised discriminant analysis and the k-nearest neighbours algorithm. Using the $\alpha$-transformation generalises two rival approaches in compositional data analysis, one (when $\alpha$ = 1) that treats the data as though they were Euclidean, ignoring the compositional constraint, and another (when $\alpha$ = 0) that employs Aitchison's centred log-ratio transformation. A numerical study with several real datasets shows that whether using $\alpha$ = 1 or $\alpha$ = 0 gives better classification performance depends on the dataset, and moreover that using an intermediate value of $\alpha$ can sometimes give better performance than using either 1 or 0.
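The abstract does not state the formula for the transformation, so the following is only a minimal sketch under an assumed form: the power-transformed composition u = x^α / Σ x^α is mapped to (D·u − 1)/α, where D is the number of parts, and α = 0 is taken to be the centred log-ratio limit. Some formulations additionally apply an orthonormal projection to remove the redundant dimension; that step is left out here, as is regularised discriminant analysis. The function names `alpha_transform` and `knn_predict` are hypothetical, and class labels are assumed to be non-negative integers.

```python
import numpy as np


def alpha_transform(x, alpha):
    """Assumed form of the transformation: (D * u - 1) / alpha,
    with u = x**alpha / sum(x**alpha). alpha = 0 uses the centred
    log-ratio, which is the limit of this expression as alpha -> 0."""
    x = np.asarray(x, dtype=float)
    D = x.shape[-1]
    if alpha == 0:
        logx = np.log(x)  # requires strictly positive parts
        return logx - logx.mean(axis=-1, keepdims=True)
    powered = x ** alpha
    u = powered / powered.sum(axis=-1, keepdims=True)
    return (D * u - 1.0) / alpha


def knn_predict(train_x, train_y, test_x, alpha, k=1):
    """k-nearest neighbours with Euclidean distance on the transformed
    data and a majority vote (labels must be non-negative integers)."""
    z_train = alpha_transform(train_x, alpha)
    z_test = alpha_transform(test_x, alpha)
    # Pairwise distances: one row per test point, one column per training point.
    dists = np.linalg.norm(z_test[:, None, :] - z_train[None, :, :], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = np.asarray(train_y)[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


if __name__ == "__main__":
    # Toy compositions, invented purely for illustration.
    train_x = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
               [0.1, 0.2, 0.7], [0.2, 0.2, 0.6]]
    train_y = [0, 0, 1, 1]
    test_x = [[0.65, 0.25, 0.10], [0.15, 0.15, 0.70]]
    for a in (1.0, 0.5, 0.0):
        print(a, knn_predict(train_x, train_y, test_x, alpha=a, k=1))
```

In a setting like the one the abstract describes, α would be treated as a tuning parameter, for example chosen by cross-validated classification accuracy over a grid of values between 0 and 1.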
Keywords: compositional data; classification; $\alpha$-transformation; $\alpha$-metric; Jensen-Shannon divergence
JEL-codes: C18
Date: 2016
Downloads: (external link)
https://mpra.ub.uni-muenchen.de/67657/1/MPRA_paper_67657.pdf original version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:pra:mprapa:67657