Clusterwise analysis for multiblock component methods
Stéphanie Bougeard,
Hervé Abdi,
Gilbert Saporta and
Ndèye Niang
Additional contact information
Stéphanie Bougeard: Anses (French agency for food, environmental and occupational health safety)
Hervé Abdi: The University of Texas at Dallas
Gilbert Saporta: CEDRIC CNAM
Ndèye Niang: CEDRIC CNAM
Advances in Data Analysis and Classification, 2018, vol. 12, issue 2, 285-313
Abstract: Multiblock component methods are applied to data sets for which several blocks of variables are measured on the same set of observations, with the goal of analyzing the relationships between these blocks of variables. In this article, we focus on multiblock component methods that integrate the information found in several blocks of explanatory variables in order to describe and explain one set of dependent variables. In the following, multiblock PLS and multiblock redundancy analysis are chosen as particular cases of multiblock component methods in which one set of variables is explained by a set of predictor variables that is organized into blocks. Because these multiblock techniques assume that the observations come from a homogeneous population, they provide suboptimal results when the observations actually come from different populations. A strategy to palliate this problem, presented in this article, is to use a technique such as clusterwise regression in order to identify homogeneous clusters of observations. This approach creates two new methods that provide clusters that have their own sets of regression coefficients. This combination of clustering and regression improves the overall quality of the prediction and facilitates the interpretation. In addition, the minimization of a well-defined criterion by means of a sequential algorithm ensures that the algorithm converges monotonically. Finally, the proposed methods are distribution-free and can be used when the explanatory variables outnumber the observations within clusters. The proposed clusterwise multiblock methods are illustrated with a simulation study and a (simulated) example from marketing.
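The general idea of clusterwise regression mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' multiblock method: it assumes a plain ridge regression within each cluster in place of multiblock PLS or multiblock redundancy analysis, and it uses a batch alternation (reassign all observations, then refit) rather than the sequential algorithm of the article. The function names and the parameters `lam`, `max_iter` and `seed` are hypothetical, chosen only for this illustration. The ridge penalty is one simple way to keep the fit defined when a cluster has fewer observations than explanatory variables.

```python
import numpy as np


def fit_ridge(X, Y, lam):
    """Coefficients minimizing ||Y - X B||^2 + lam * ||B||^2 (assumed local model)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)


def criterion(X, Y, labels, coefs, lam):
    """Penalized residual sum of squares, summed over clusters."""
    total = 0.0
    for k, B in enumerate(coefs):
        mask = labels == k
        total += np.sum((Y[mask] - X[mask] @ B) ** 2) + lam * np.sum(B ** 2)
    return total


def clusterwise_regression(X, Y, n_clusters, lam=1e-3, max_iter=100, seed=0):
    """Alternate between refitting one model per cluster and reassigning observations."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=X.shape[0])  # random initial partition
    coefs = [fit_ridge(X[labels == k], Y[labels == k], lam) for k in range(n_clusters)]
    history = [criterion(X, Y, labels, coefs, lam)]
    for _ in range(max_iter):
        # Assignment step: each observation goes to the cluster whose model fits it best.
        errors = np.stack([np.sum((Y - X @ B) ** 2, axis=1) for B in coefs], axis=1)
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
        # Refit step: each cluster gets its own set of regression coefficients.
        coefs = [fit_ridge(X[labels == k], Y[labels == k], lam) for k in range(n_clusters)]
        history.append(criterion(X, Y, labels, coefs, lam))
    return labels, coefs, history


# Simulated data: two populations with different regression coefficients.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
Y = np.vstack([X[:30] @ np.array([[1.0], [2.0], [-1.0]]),
               X[30:] @ np.array([[-2.0], [0.5], [3.0]])])
labels, coefs, history = clusterwise_regression(X, Y, n_clusters=2)
print(history)
```

In this sketch, neither step can increase the penalized criterion, so the values stored in `history` never go up. The result depends on the initial partition and may be a local optimum only, so several random starts would normally be compared.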
Keywords: Multiblock component method; Clusterwise regression; Typological regression; Cluster analysis; Dimension reduction; 62H30; 62H25; 91C20
Downloads:
http://link.springer.com/10.1007/s11634-017-0296-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:12:y:2018:i:2:d:10.1007_s11634-017-0296-8
Ordering information: This journal article can be ordered from
http://www.springer. ... ds/journal/11634/PS2
Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs
More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)
Bibliographic data for series maintained by Sonal Shukla.