Variables selection in observational and experimental studies
Joachim Kunert and Claus Weihs
No 2000,22, Technical Reports from Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen
Abstract:
This paper discusses whether differences in the data structure of observational and experimental studies should lead to different strategies for variable selection. On the one hand, we argue that outliers in the predictor variables have to be treated differently in the two kinds of studies, which raises philosophical problems with the applicability of cross validation in experimental studies. On the other hand, we show that a well-designed experiment may yield a factor structure that is very appropriate for cross validation, namely a certain balance in the observations together with orthogonality of the factors. This may be why cross validation has proven in practice to be a valuable tool for variable selection in experimental studies as well. In contrast, we show that variable selection based on cross validation is not appropriate for saturated orthogonal designs. Following this fundamental discussion, we illustrate with a number of examples that the same methods for variable selection can be applied successfully in observational as well as experimental studies.
Keywords: variables selection; stepwise regression; cross validation; principal components; screening; optimization
Date: 2000
Downloads:
https://www.econstor.eu/bitstream/10419/77095/2/2000-22.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:zbw:sfb475:200022
Bibliographic data for series maintained by ZBW - Leibniz Information Centre for Economics.