Selection and statistical analysis of compositional ratios
Michael Greenacre
Economics Working Papers from Department of Economics and Business, Universitat Pompeu Fabra
Abstract:
Compositional data are nonnegative data with the property of closure: that is, the values of the components (so-called parts) of each observation have a fixed sum, usually 1 or 100%. Compositional data cannot be analyzed by conventional statistical methods, since the value of any part depends on the choice of the other parts of the composition of interest. For example, reporting the mean and standard deviation of a specific part makes no sense, nor does the correlation between two parts. I propose that a small set of ratios of parts can be determined, either by expert choice or by automatic selection, which effectively replaces the compositional data set. This set can be chosen to explain 100% of the variance in the compositional data, or as close to 100% as required. These part ratios can then be validly summarized and analyzed by conventional univariate methods, as well as by multivariate methods, where the ratios are preferably log-transformed.
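The dependence on "the choice of the other parts" can be illustrated with a small numerical sketch. This is not the paper's selection procedure; it only shows, on made-up data, that the mean of a single part changes when one part is dropped and the rest re-closed, whereas the log-ratio of two parts does not. The array `raw` and the helper `close` are hypothetical names introduced for this illustration.

```python
import numpy as np

def close(x):
    """Rescale each row to sum to 1 (the closure operation)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

# Hypothetical data: three observations on four parts (illustrative only).
raw = np.array([[10.0, 20.0, 30.0, 40.0],
                [ 5.0, 25.0, 30.0, 40.0],
                [20.0, 20.0, 20.0, 40.0]])

full = close(raw)         # full four-part composition
sub = close(raw[:, :3])   # subcomposition: fourth part dropped, then re-closed

# The mean of the first part depends on which other parts are included.
mean_full = full[:, 0].mean()
mean_sub = sub[:, 0].mean()

# The log-ratio of the first two parts is the same in both compositions.
lr_full = np.log(full[:, 0] / full[:, 1])
lr_sub = np.log(sub[:, 0] / sub[:, 1])

print("mean of part 1 (full, sub):", mean_full, mean_sub)
print("log-ratios equal:", np.allclose(lr_full, lr_sub))
```

Because a ratio of two parts is unaffected by closure, its mean, standard deviation and correlations with other ratios can be reported in the usual way, which is the property the abstract relies on.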
Keywords: compositional data; logarithmic transformation; log-ratio analysis; multivariate analysis; ratios; univariate statistics.
Date: 2016-08
New Economics Papers: this item is included in nep-ecm
Downloads: https://econ-papers.upf.edu/papers/1551.pdf Whole Paper (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:upf:upfgen:1551