EconPapers    
Economics at your fingertips  
 

Imputation of missing values for compositional data using classical and robust methods

K. Hron, M. Templ and Peter Filzmoser

Computational Statistics & Data Analysis, 2010, vol. 54, issue 12, 3095-3107

Abstract: New imputation algorithms for estimating missing values in compositional data are introduced. A first proposal uses the k-nearest neighbor procedure based on the Aitchison distance, a distance measure especially designed for compositional data. It is important to adjust the estimated missing values to the overall size of the compositional parts of the neighbors. As a second proposal an iterative model-based imputation technique is introduced which initially starts from the result of the proposed k-nearest neighbor procedure. The method is based on iterative regressions, thereby accounting for the whole multivariate data information. The regressions have to be performed in a transformed space, and depending on the data quality classical or robust regression techniques can be employed. The proposed methods are tested on a real and on simulated data sets. The results show that the proposed methods outperform standard imputation methods. In the presence of outliers, the model-based method with robust regressions is preferable.

Date: 2010
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (17)

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167-9473(09)00436-8
Full text for ScienceDirect subscribers only.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:54:y:2010:i:12:p:3095-3107

Access Statistics for this article

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu ().

 
Page updated 2025-03-19
Handle: RePEc:eee:csdana:v:54:y:2010:i:12:p:3095-3107