Cluster-weighted modeling with measurement error in covariates
Shaho Zarei
Communications in Statistics - Theory and Methods, 2024, vol. 53, issue 24, 8916-8928
Abstract:
The cluster-weighted model (CWM) is a model-based clustering approach that uses a mixture of regression models to cluster data points based on both a response variable Y and covariates 𝑿, where the covariates are assumed to be random. The Gaussian CWM (GCWM) is the most commonly used member of the CWM family; it adopts the Gaussian distribution for both the covariates and the response given the covariates. In a mixture of regressions, data points are assigned to clusters based only on the conditional distribution of the response given the covariates, independently of the covariates’ distribution. In a CWM, the covariates’ distribution is also used to assign data points to clusters, with the aim of improving clustering performance. Existing research on CWMs is limited to directly observed covariates, which may not reflect real-world scenarios where measurement errors (MEs) occur. Measurement error can lead to inconsistent estimates and, consequently, to spurious or obscured clusters. In this article, we assume that the random covariates 𝑿 are latent and observed with an independent ME that follows a Gaussian distribution. A new generalized expectation-maximization algorithm is defined for estimating the model parameters. The performance of the proposal is illustrated and compared with that of the GCWM using both simulated and real data.
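The difference between the two assignment rules described in the abstract can be illustrated with a minimal sketch. It assumes a single univariate covariate, two clusters, and made-up parameter values; the function names (`normal_pdf`, `posteriors`) and all numbers are hypothetical and are not taken from the article. The sketch covers only the standard mixture-of-regressions and GCWM assignment rules, not the measurement-error extension or the generalized EM algorithm proposed in the paper.

```python
import numpy as np

def normal_pdf(v, mean, sd):
    """Density of a univariate Gaussian N(mean, sd^2) evaluated at v."""
    z = (v - mean) / sd
    return np.exp(-0.5 * z * z) / (sd * np.sqrt(2.0 * np.pi))

def posteriors(x, y, pi, b0, b1, sigma, mu=None, tau=None):
    """Posterior cluster probabilities for one observation (x, y).

    Without mu/tau: mixture of regressions, which uses only the density of y given x.
    With mu/tau: Gaussian CWM, which also multiplies in the covariate density of x.
    """
    w = pi * normal_pdf(y, b0 + b1 * x, sigma)
    if mu is not None:
        w = w * normal_pdf(x, mu, tau)
    return w / w.sum()

# Hypothetical parameters: both clusters share one regression line,
# but their covariates are centred at different locations.
pi = np.array([0.5, 0.5])      # mixing weights
b0 = np.array([0.0, 0.0])      # intercepts
b1 = np.array([1.0, 1.0])      # slopes
sigma = np.array([1.0, 1.0])   # response standard deviations
mu = np.array([-2.0, 2.0])     # covariate means (CWM only)
tau = np.array([1.0, 1.0])     # covariate standard deviations (CWM only)

p_mixreg = posteriors(2.0, 2.0, pi, b0, b1, sigma)
p_gcwm = posteriors(2.0, 2.0, pi, b0, b1, sigma, mu, tau)
print("mixture of regressions:", p_mixreg)
print("Gaussian CWM:          ", p_gcwm)
```

With identical regression lines, the mixture of regressions cannot separate the clusters and assigns equal probability to each, while the GCWM rule favours the cluster whose covariate distribution is centred near the observed x. This is the sense in which the covariates' distribution contributes to the assignment.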
Date: 2024
Downloads: (external link)
http://hdl.handle.net/10.1080/03610926.2024.2311795 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:lstaxx:v:53:y:2024:i:24:p:8916-8928
DOI: 10.1080/03610926.2024.2311795
Communications in Statistics - Theory and Methods is currently edited by Debbie Iscoe