Use of Bad Training Data for Better Predictions

Grossman, Tal; Lapedes, Alan

Use of Bad Training Data for Better Predictions

Tal Grossman and Alan Lapedes

Abstract: We show how randomly scrambling the output of classes of various fractions of the training data may be used to improve predictive accuracy of a classification algorithm. We present a method for calculating the ``noise sensitivity signature'' of a learning algorithm which is based on scrambling the output classes. This signature can be used to indicate a good match between the complexity of the classifier and the complexity of the data. Use of noise sensitivity signatures is distinctly different from other schemes to avoid overtraining, such as cross-validation, which uses only part of the training data, or various penalty functions, which are not data-adaptive. Noise sensitivity signature methods use all of the training data and are manifestly data-adaptive and non-parametric. They are well suited for situations with limited training data.

Date: 1995-02
References: Add references at CitEc
Citations:

There are no downloads for this item, see the EconPapers FAQ for hints about obtaining it.

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:wop:safiwp:95-02-019

Access Statistics for this paper

More papers in Working Papers from Santa Fe Institute Contact information at EDIRC.
Bibliographic data for series maintained by Thomas Krichel ().