Classification And Regression Tree analysis (CART) with Stata
Wim van Putten
Department of Statistics, Erasmus MC - Daniel den Hoed Cancer Center, Rotterdam, Netherlands
Abstract:
Classification and Regression Tree (CART) analysis can be applied to identify and assess prognostic factors in clinical research. It involves repeatedly subdividing a group of subjects by choosing, for binary, ordinal, or continuous covariates, the cutpoint that maximizes a given split criterion. I will describe a specific implementation of CART for failure time data as a Stata ado-file, cart.ado, which uses an adjusted P-value as the split criterion. The P-value is associated with the chi-square logrank statistic based on residuals. The adjustment accounts for the multiple testing inherent in searching for the optimal cutpoint, that is, the one with the maximum chi-square value (Lausen, 1997). Examples of applications are given. CART carries a serious risk of overfitting. However, it can be a useful exploratory tool alongside more standard regression-type techniques. [Reference: Lausen B. et al. The regression tree method and its application in nutritional epidemiology. Informatik, Biometrie und Epidemiologie in Medizin und Biologie 28 (1), 1-13, 1997.]
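To make the cutpoint search concrete, here is a minimal Python sketch of a single split step. It is not cart.ado and is not taken from the talk. It rests on several assumptions: it uses the ordinary two-group logrank chi-square computed directly from the survival data, not the residual-based version the abstract mentions; it reports only the maximal chi-square and does not apply the multiple-testing adjustment of the P-value; and the function names `logrank_chi2` and `best_cutpoint`, the `min_size` parameter, and the toy data are all hypothetical.

```python
def logrank_chi2(times, events, group):
    """Two-group logrank chi-square (1 df).

    times  : follow-up times
    events : 1 if the failure was observed, 0 if censored
    group  : 0/1 group indicator for each subject
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, overall and in group 1.
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, group) if x >= t and g == 1)
        # Failures at time t, overall and in group 1.
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, group)
                 if x == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutpoint(times, events, x, min_size=2):
    """Scan all cutpoints c of covariate x (split: x <= c versus x > c)
    and return (c, chi2) for the split with the largest logrank chi-square.
    Splits leaving fewer than min_size subjects on either side are skipped.
    """
    best_cut, best_chi2 = None, 0.0
    for c in sorted(set(x))[:-1]:
        group = [1 if v > c else 0 for v in x]
        n_high = sum(group)
        if n_high < min_size or len(x) - n_high < min_size:
            continue
        chi2 = logrank_chi2(times, events, group)
        if chi2 > best_chi2:
            best_cut, best_chi2 = c, chi2
    return best_cut, best_chi2


# Toy data (hypothetical): subjects with a higher covariate value fail earlier.
times = [10, 12, 11, 13, 2, 3, 1, 4]
events = [1, 1, 1, 1, 1, 1, 1, 1]
x = [1, 2, 3, 4, 5, 6, 7, 8]
cut, chi2 = best_cutpoint(times, events, x)
print("best cutpoint:", cut, "max chi-square:", round(chi2, 3))
```

A full tree would apply this search to every candidate covariate, split on the best one, and recurse into the two subgroups. Because the chi-square reported here is the maximum over many candidate cutpoints, its nominal P-value is too optimistic, which is the reason the abstract calls for an adjusted P-value.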
From: Dutch-German Stata Users' Group Meetings 2002, Stata Users Group