Cluster-based multivariate outlier identification and re-weighted regression in linear models
Ekele Alih and Hong Choon Ong
Journal of Applied Statistics, 2015, vol. 42, issue 5, 938-955
Abstract:
A cluster methodology, motivated by a robust similarity matrix, is proposed for identifying likely multivariate outlier structure and for estimating weighted least-squares (WLS) regression parameters in linear models. The proposed method is an agglomeration of procedures that begins with clustering the n observations, proceeds through a test of the 'no-outlier hypothesis' (TONH), and ends with a weighted least-squares regression estimation. The cluster phase partitions the n observations into a set of size h, called the main cluster, and a minor cluster of size n - h. A robust distance emerges from the main cluster, upon which the test of the 'no-outlier hypothesis' is conducted. An initial WLS regression estimate is computed from the robust distance obtained from the main cluster. A re-weighted least-squares (RLS) regression estimate is then updated, with weights based on the normalized residuals, until convergence. The proposed procedure blends an agglomerative hierarchical cluster analysis with complete linkage, through the TONH, with the re-weighted regression estimation phase. Hence, we propose to call it cluster-based re-weighted regression (CBRR). The CBRR is compared with three existing procedures using two data sets known to exhibit masking and swamping. The performance of CBRR is further examined through a simulation experiment. The results obtained from the data set illustrations and the Monte Carlo study show that the CBRR is effective in detecting multivariate outliers where other methods are susceptible to masking and swamping. The CBRR does not require enormous computation and is largely resistant to masking and swamping.
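The pipeline described in the abstract (cluster, derive a robust distance, test, fit WLS, then re-weight until convergence) can be loosely illustrated in Python. This is only a rough sketch under stated assumptions, not the authors' CBRR: the abstract does not specify the robust similarity matrix, the TONH statistic, or the weight function. The sketch substitutes Euclidean distances, a chi-square cutoff on Mahalanobis distances, and hard 0/1 weights on MAD-normalized residuals. The names `wls` and `cbrr_sketch` and all parameter defaults are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import chi2


def wls(X, y, w):
    """Weighted least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return beta


def cbrr_sketch(X, y, cutoff=2.5, max_iter=50, tol=1e-8):
    """Illustrative stand-in for the cluster -> test -> WLS -> RLS pipeline."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    Z = np.column_stack([X, y])

    # Cluster phase: complete-linkage agglomerative clustering cut into two
    # groups; the larger group plays the role of the main cluster.
    # (Assumption: Euclidean distance, not the paper's similarity matrix.)
    labels = fcluster(linkage(Z, method="complete"), t=2, criterion="maxclust")
    main = labels == np.bincount(labels).argmax()

    # Robust distance: squared Mahalanobis distance of every observation
    # from the mean and covariance of the main cluster.
    mu = Z[main].mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(Z[main], rowvar=False))
    diff = Z - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

    # Assumed stand-in for the TONH: flag points beyond a chi-square quantile.
    w = (d2 <= chi2.ppf(0.975, df=Z.shape[1])).astype(float)

    # Initial WLS fit, then re-weight on normalized residuals until the
    # coefficients stop changing (or max_iter is reached).
    beta = wls(X, y, w)
    A = np.column_stack([np.ones(len(y)), X])
    for _ in range(max_iter):
        r = y - A @ beta
        scale = 1.4826 * np.median(np.abs(r))  # MAD-type scale (assumption)
        w_new = (np.abs(r) <= cutoff * scale).astype(float)
        beta_new = wls(X, y, w_new)
        converged = np.allclose(beta_new, beta, atol=tol, rtol=0.0)
        beta, w = beta_new, w_new
        if converged:
            break
    return beta, w


# Synthetic example: 54 clean points on y = 1 + 2x plus 6 clustered outliers.
rng = np.random.default_rng(0)
x_clean = rng.normal(0.0, 1.0, 54)
y_clean = 1.0 + 2.0 * x_clean + rng.normal(0.0, 0.1, 54)
x_out = rng.normal(20.0, 0.1, 6)
y_out = rng.normal(-40.0, 0.1, 6)
x_all = np.concatenate([x_clean, x_out])
y_all = np.concatenate([y_clean, y_out])

beta, weights = cbrr_sketch(x_all, y_all)
print("intercept, slope:", beta)
print("observations given zero weight:", int((weights == 0).sum()))
```

In this synthetic setting the six planted outliers should receive zero weight, and the fitted line should stay close to the clean-data relationship. The hard 0/1 weights are chosen for simplicity; a smooth weight function could be swapped in without changing the overall structure.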
Date: 2015
Downloads: http://hdl.handle.net/10.1080/02664763.2014.993366 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:japsta:v:42:y:2015:i:5:p:938-955
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/CJAS20
DOI: 10.1080/02664763.2014.993366
Journal of Applied Statistics is currently edited by Robert Aykroyd
Bibliographic data for series maintained by Chris Longhurst.