Econometric Analysis with Compositional and Non-Compositional Covariates
Michael Ben-Gad
Authors registered in the RePEc Author Service: Fabrice Defever
Working Papers from Department of Economics, City University London
Abstract:
In this paper I consider how best to incorporate compositional data (shares of a whole which can be represented as points on a simplex) together with non-compositional data as covariates in a linear regression. The standard method for incorporating compositional data in regressions is to omit one share to overcome the problem of singularity. I demonstrate that doing so ignores the compositional nature of the data and the resulting models are not objects in a vector space, which in turn reduces their usefulness. In terms of Aitchison geometry (the only geometry that can generate a vector space on a simplex), I show how this method also grossly distorts the relationship between points in the compositional data set. Furthermore, the regression coefficients that result are not permutation invariant, so unless there is an obvious baseline category to be omitted with which the other variables in the composition ought naturally to be compared, this approach gives researchers latitude to choose the permutation of the model that supports a particular hypothesis or appears most convincing in terms of p-values. The alternatives in this paper build on work by Aitchison (1982, 1986) on additive logarithmic ratio (ALR) transformations and Egozcue et al. (2003) on isometric logarithmic ratio (ILR) transformations. Transforming the compositional data using ALRs generates regressions that are permutation invariant and hyperplanes in a vector space. However, ALRs translate the points in the simplex into coordinates relative to an oblique basis, so the angles and distances between the data points remain somewhat distorted, though this distortion is inversely related to the number of shares in the composition. By contrast, ILRs eliminate the distortion by translating the points into coordinates relative to an orthogonal basis. However, the resulting regressions are no longer permutation invariant and are difficult to interpret. To overcome these shortcomings, Hron et al. (2012) suggest using ILRs, but combining the coefficient estimates across all the different permutations to produce one statistical model. I demonstrate that estimating a separate regression for each permutation is unnecessary: estimating either a single regression using ALR coordinates or a constrained regression, and then multiplying the resulting regression coefficients and standard errors associated with the compositional variables by a simple factor, is sufficient. Though log-ratios incorporate more information about the nature of compositional data as coordinates in a simplex, I demonstrate that using them does not exacerbate the inherent multicollinearity present in compositional datasets. Throughout, I use economic growth regressions with compositional data on ten religious categories, similar to Barro and McCleary (2003) and McCleary and Barro (2006), to demonstrate and contrast all these different approaches.
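The contrast the abstract draws between ALR and ILR coordinates can be illustrated with a small sketch. The code below is not taken from the paper: the function names, the two example compositions, and the choice of the "pivot coordinate" form of the ILR (one common orthonormal basis among many) are all assumptions made for illustration. It shows that Euclidean distances between ILR coordinates reproduce the Aitchison distance, while distances between ALR coordinates generally do not.

```python
import numpy as np

def closure(x):
    """Rescale positive parts so that each composition sums to one."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def alr(x):
    """Additive log-ratio: log of each share relative to the last share."""
    x = closure(x)
    return np.log(x[..., :-1] / x[..., -1:])

def clr(x):
    """Centred log-ratio: log shares minus their mean (log geometric mean)."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

def ilr_pivot(x):
    """ILR in pivot coordinates (an assumed choice of orthonormal basis).

    Coordinate i compares part i with the geometric mean of the parts
    that follow it, scaled so the implied basis is orthonormal.
    """
    logx = np.log(closure(x))
    n_parts = logx.shape[-1]
    coords = []
    for i in range(n_parts - 1):
        k = n_parts - 1 - i  # number of parts after part i
        rest_mean = logx[..., i + 1:].mean(axis=-1)
        coords.append(np.sqrt(k / (k + 1)) * (logx[..., i] - rest_mean))
    return np.stack(coords, axis=-1)

def aitchison_distance(x, y):
    """Aitchison distance, computed as the Euclidean distance between CLRs."""
    return np.linalg.norm(clr(x) - clr(y))

# Two made-up three-part compositions (illustrative data only).
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.2, 0.6])

d_aitchison = aitchison_distance(a, b)
d_ilr = np.linalg.norm(ilr_pivot(a) - ilr_pivot(b))
d_alr = np.linalg.norm(alr(a) - alr(b))

print("Aitchison distance:", d_aitchison)
print("Distance in ILR coordinates:", d_ilr)
print("Distance in ALR coordinates:", d_alr)
```

In this sketch the ILR distance coincides with the Aitchison distance because the pivot basis is orthonormal, whereas the ALR distance differs because its basis is oblique. Both transforms require strictly positive shares, so zero shares would need separate treatment before taking logs.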
Keywords: Compositional Data; Aitchison Geometry; Isometric Logarithmic Ratios; Economic Growth Regressions
Date: 2022-10-02
New Economics Papers: this item is included in nep-ecm
Downloads:
https://openaccess.city.ac.uk/id/eprint/28957/1/Dept_Econ_WP2201.pdf
Persistent link: https://EconPapers.repec.org/RePEc:cty:dpaper:22/01
More papers in Working Papers from Department of Economics, City University London. Department of Economics, Social Sciences Building, City University London, Whiskin Street, London, EC1R 0JD, United Kingdom.
Bibliographic data for series maintained by Research Publications Librarian.