Improving Classifier Performance by Using Fictitious Training Data? A Case Study

Ralf Stecking and Klaus B. Schebesch
Additional contact information
Ralf Stecking: University of Oldenburg
Klaus B. Schebesch: University “Vasile Goldiş”

A chapter in Operations Research Proceedings 2007, 2008, pp 89-94 from Springer

Abstract: Many empirical data sets describing features of persons or objects with associated class labels (e.g. credit client features and the recorded defaulting behaviors in our application [5], [6]) are clearly not linearly separable. However, owing to an interplay of relatively sparse data (relating to high-dimensional input feature spaces) and a validation procedure like leave-one-out, a nonlinear classification can, in many cases, improve this situation only in a minor way. Attributing all the remaining errors to noise seems rather implausible, as data recording is offline and not prone to errors of the type that occur, for example, when measuring process data with (online) sensors. Experiments with classification models on input subsets even suggest that our credit client data contain some hidden redundancy. This redundancy was not eliminated by statistical data preprocessing; it leads to rather competitive validated models on input subsets and even to slightly superior results for combinations of such input subset base models [3]. These base models all reflect different views of the same data. However, class regions with highly nonlinear boundaries can also occur if important features (i.e. other explanatory factors) are for some reason not available (unknown, neglected, etc.). To see this, simply project linearly separable data onto a feature subset of smaller dimension.
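
The projection argument that closes the abstract can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the uniform features x1 and x2, the hidden feature x3 = x1^2, the separating weights w_true, and the least-squares fit used as a simple stand-in for a linear classifier. In the full three-dimensional space the classes are separated by a hyperplane by construction; once x3 is dropped, the class boundary in (x1, x2) becomes the parabola x2 = x1^2, which no straight line reproduces.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Two observed features and one hidden feature (hypothetical setup).
x1 = rng.uniform(-1.0, 1.0, n)
x2 = rng.uniform(-1.0, 1.0, n)
x3 = x1 ** 2  # hidden feature, here a nonlinear function of x1

X_full = np.column_stack([x1, x2, x3])
w_true = np.array([0.0, -1.0, 1.0])  # class +1 iff x3 - x2 > 0
y = np.where(X_full @ w_true > 0, 1, -1)

# Full space: the generating hyperplane separates the classes by construction.
pred_full = np.where(X_full @ w_true > 0, 1, -1)
acc_full = float(np.mean(pred_full == y))


def linear_fit_accuracy(X, labels):
    """Training accuracy of a least-squares linear fit with intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, labels.astype(float), rcond=None)
    pred = np.where(A @ coef > 0, 1, -1)
    return float(np.mean(pred == labels))


# Projection onto (x1, x2): the hidden feature x3 is no longer available.
X_proj = X_full[:, :2]
acc_proj = linear_fit_accuracy(X_proj, y)

print(f"full space (3 features): accuracy = {acc_full:.3f}")
print(f"projected  (2 features): accuracy = {acc_proj:.3f}")
```

The least-squares fit is only one possible linear rule, so its accuracy on the projected data is a rough indication and not the best achievable linear result. The qualitative point is that a linear rule which is exact in the full space becomes inexact after projection.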

Keywords: Support Vector Machine; Support Vector Machine Model; Training Point; Credit Scoring; Separable Data
Date: 2008

Persistent link: https://EconPapers.repec.org/RePEc:spr:oprchp:978-3-540-77903-2_14

Ordering information: This item can be ordered from
http://www.springer.com/9783540779032

DOI: 10.1007/978-3-540-77903-2_14

Page updated 2025-04-01
Handle: RePEc:spr:oprchp:978-3-540-77903-2_14