Statistical Evaluation of Genetic Programming

Kaboudan, M. A.

M. A. Kaboudan: Penn State Lehigh Valley

No 1031, Computing in Economics and Finance 1999 from Society for Computational Economics

Abstract: A recent advance in genetic computation is the heuristic prediction model (symbolic regression), which has received little statistical scrutiny. Diagnostic checks of genetically evolved models (GEMs) as a forecasting method are therefore essential. This requires assessing the statistical properties of the errors produced by GEMs. Since the predicted models and their forecasts are produced artificially by a computer program, there is little control over the final model specification. However, it is of interest to understand the final specification and to know the statistical characteristics of its errors, particularly if artificially produced models furnish better forecasts than humanly conceived ones. This paper's main concern is the statistical analysis of errors from genetically evolved models. Genetic programming (GP) is one of two computational algorithms for evolving regression models, the other being evolutionary programming (EP). GP-QUICK, computer code written in C++, evolves the regression models for this study. GP-QUICK replicates an original GP program in LISP by Koza. Both are designed to evolve regression models randomly, finding the one that best replicates the series' data-generating process. Prediction errors from GP-evolved regression models are tested for whiteness (absence of autocorrelation) and for normality. Well-established diagnostic tools for linear time-series modeling also apply to nonlinear models. Only diagnostic methods that use the errors, without having to replicate the models that produced them, are selected and applied to the series. This restriction avoids having to reproduce the resulting genetically evolved equations. These equations are generated by a random selection mechanism that is almost impossible to replicate with GP unless the process is deterministic, and they are usually too complex for standard statistical software to reproduce and analyze. The diagnostic methods are selected for their simplicity and speed of execution without sacrificing reliability.
This paper contains four other sections. One presents the diagnostic tools used to determine the statistical properties of residuals produced by GEMs. Residuals from evolved models representing systems with known characteristics are used to evaluate the statistical performance of GEMs. Another furnishes six data-generating processes representing linear, linear-stochastic, nonlinear, nonlinear-stochastic, and pseudo-random systems, for which models are evolved and residuals computed. A third contains the diagnostics of those residuals. The diagnostic tools include the Kolmogorov-Smirnov test for whiteness developed by Durbin (1969), as well as tests of the null hypotheses that the mean, skewness, and kurtosis of the fitted residuals are each equal to zero. The last section gives conclusions and directions for future research.
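As an illustration of diagnostics that need only the residual series and not the evolved equation, the following minimal Python sketch computes the two kinds of checks named above. It is not the paper's code. The function names are hypothetical, the moment tests use standard large-sample normal approximations (with "kurtosis equal to zero" read as excess kurtosis), and the whiteness check is a cumulative-periodogram Kolmogorov-Smirnov statistic compared against the commonly used approximate 5% bound 1.36/sqrt(m) rather than Durbin's exact tabulated values.

```python
import math


def moment_z_scores(residuals):
    """Large-sample z-statistics for H0: mean = 0, skewness = 0, excess kurtosis = 0.

    Assumption: asymptotic standard errors sqrt(6/n) and sqrt(24/n) are used
    for skewness and kurtosis; the paper's exact statistics may differ.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    sample_sd = math.sqrt(m2 * n / (n - 1))
    return {
        "mean": mean / (sample_sd / math.sqrt(n)),
        "skewness": (m3 / m2 ** 1.5) / math.sqrt(6.0 / n),
        "kurtosis": (m4 / m2 ** 2 - 3.0) / math.sqrt(24.0 / n),
    }


def cumulative_periodogram_stat(residuals):
    """Return (max deviation, m) for the cumulative-periodogram whiteness check.

    For white noise the normalized cumulative periodogram should stay close
    to the straight line j/m over the m Fourier frequencies in (0, pi).
    """
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    m = (n - 1) // 2  # Fourier frequencies strictly between 0 and pi
    power = []
    for j in range(1, m + 1):
        w = 2.0 * math.pi * j / n
        re = sum(c * math.cos(w * t) for t, c in enumerate(centered))
        im = sum(c * math.sin(w * t) for t, c in enumerate(centered))
        power.append(re * re + im * im)
    total = sum(power)
    if total == 0.0:
        raise ValueError("residuals have no variation")
    running, max_dev = 0.0, 0.0
    for j, p in enumerate(power, start=1):
        running += p
        max_dev = max(max_dev, abs(running / total - j / m))
    return max_dev, m


def looks_white(residuals, k_alpha=1.36):
    """Approximate 5% decision rule (assumption: large-sample bound k_alpha / sqrt(m))."""
    stat, m = cumulative_periodogram_stat(residuals)
    return stat <= k_alpha / math.sqrt(m)


if __name__ == "__main__":
    # A strongly periodic "residual" series should fail the whiteness check.
    periodic = [math.sin(2.0 * math.pi * 5 * t / 100) for t in range(100)]
    print(cumulative_periodogram_stat(periodic), looks_white(periodic))
    print(moment_z_scores([-2.0, -1.0, 0.0, 1.0, 2.0]))
```

Both functions take only a list of residuals, which matches the restriction described in the abstract: nothing about the evolved model itself has to be known or re-estimated.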

Date: 1999-03-01

Persistent link: https://EconPapers.repec.org/RePEc:sce:scecf9:1031

More papers in Computing in Economics and Finance 1999 from Society for Computational Economics CEF99, Boston College, Department of Economics, Chestnut Hill MA 02467 USA. Contact information at EDIRC.
Bibliographic data for series maintained by Christopher F. Baum.