Abstract:
Classical canonical correlation analysis is one of the fundamental tools in statistics for investigating the "linear" association between two sets of variables. We propose a method, called semiparametric canonical analysis, that generalizes canonical correlation analysis to incorporate important "non-linear" association. Semiparametric canonical analysis is easy to implement and interpret. Its statistical properties are proved, and a consistent estimation method is developed. Selection of significant semiparametric canonical analysis components is discussed. Simulations suggest that the proposed methods perform satisfactorily in finite samples. One environmental data set and one social science data set are investigated, in which non-linear canonical associations are observed and interpreted. Copyright (c) 2008 Royal Statistical Society.