Improving Classifier Performance by Using Fictitious Training Data? A Case Study
Ralf Stecking and
Klaus B. Schebesch
Additional contact information
Ralf Stecking: University of Oldenburg
Klaus B. Schebesch: University “Vasile Goldiş”
A chapter in Operations Research Proceedings 2007, 2008, pp 89-94 from Springer
Abstract:
Many empirical data sets that describe features of persons or objects together with associated class labels (e.g. credit client features and the recorded defaulting behavior in our application [5], [6]) are clearly not linearly separable. However, owing to the interplay of relatively sparse data (relative to the high-dimensional input feature space) and a validation procedure such as leave-one-out, nonlinear classification can in many cases improve on this situation only in a minor way. Attributing all remaining errors to noise seems rather implausible, as the data are recorded offline and are not prone to errors of the type that occur, e.g., when measuring process data with (online) sensors. Experiments with classification models on input subsets even suggest that our credit client data contain some hidden redundancy. This redundancy was not eliminated by statistical data preprocessing; it leads to rather competitive validated models on input subsets and even to slightly superior results for combinations of such input subset base models [3]. These base models all reflect different views of the same data. However, class regions with highly nonlinear boundaries can also occur if important features (i.e. other explanatory factors) are for some reason not available (unknown, neglected, etc.). To see this, simply project linearly separable data onto a feature subset of smaller dimension.
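The projection argument in the last sentence of the abstract can be illustrated with a small sketch. The toy data, the "hidden" feature x2, the separating direction w and the helper linearly_separable_1d are all illustrative assumptions and are not taken from the chapter; the sketch only shows that data separable by a line in two dimensions can have interleaved classes once one feature is dropped.

```python
import numpy as np

def linearly_separable_1d(x, y):
    """Return True if a single threshold on x splits the classes +1 and -1."""
    pos, neg = x[y == 1], x[y == -1]
    return bool(pos.max() < neg.min() or neg.max() < pos.min())

# Hypothetical toy data: labels alternate along x1, and the second
# feature x2 = x1 + y carries the information needed to separate them.
x1 = np.arange(6, dtype=float)
y = np.where(np.arange(6) % 2 == 0, 1, -1)
x2 = x1 + y
X = np.column_stack([x1, x2])

# In the full 2-D space the line x2 - x1 = 0 separates the classes.
w, b = np.array([-1.0, 1.0]), 0.0
separable_full = bool(np.all(y * (X @ w + b) > 0))

# After projecting onto x1 alone (x2 "not available"), the classes
# alternate along the axis and no single threshold separates them.
separable_projected = linearly_separable_1d(X[:, 0], y)

print(separable_full, separable_projected)
```

In this toy case a classifier restricted to x1 would need a boundary with several switches between the classes, which is the kind of highly nonlinear class region the abstract attributes to missing features.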
Keywords: Support Vector Machine; Support Vector Machine Model; Training Point; Credit Scoring; Separable Data
Date: 2008
Persistent link: https://EconPapers.repec.org/RePEc:spr:oprchp:978-3-540-77903-2_14
Ordering information: This item can be ordered from
http://www.springer.com/9783540779032
DOI: 10.1007/978-3-540-77903-2_14