The Proportion for Splitting Data into Training and Test Set for the Bootstrap in Classification Problems

Vrigazova Borislava
Additional contact information
Vrigazova Borislava: Sofia University, Faculty of Economics and Business Administration, Bulgaria

Business Systems Research, 2021, vol. 12, issue 1, 228-242

Abstract: Background: The bootstrap can be an alternative to cross-validation as a training/test set splitting method, since it reduces computing time in classification problems compared with tenfold cross-validation. Objectives: This research investigates what proportion should be used to split the dataset into training and test sets so that the bootstrap is competitive with other resampling methods in terms of accuracy. Methods/Approach: Different train/test split proportions are used with the following resampling methods: the bootstrap, leave-one-out cross-validation, tenfold cross-validation, and random repeated train/test splitting. Their performance is tested on several classification methods: logistic regression, the decision tree, and k-nearest neighbours. Results: The findings suggest that using a different structure of the test set (e.g. 30/70, 20/80) can further optimize the performance of the bootstrap when applied to logistic regression and the decision tree. For k-nearest neighbours, tenfold cross-validation with a 70/30 train/test splitting ratio is recommended. Conclusions: Depending on the characteristics and the preliminary transformations of the variables, the bootstrap can improve the accuracy of the classification problem.
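
The abstract does not spell out the resampling procedure, so the following is only a minimal sketch of one possible reading, not the author's implementation. It assumes that "the bootstrap" with a given train/test proportion means: shuffle the data, hold out the test share, and draw the training set with replacement from the remaining share. The synthetic data, the small k-nearest-neighbours classifier, and all function names (`make_data`, `bootstrap_accuracy`, `kfold_accuracy`) are hypothetical and chosen only to keep the example free of external dependencies.

```python
import random
from collections import Counter

def make_data(n=200, seed=0):
    """Synthetic two-class data (hypothetical; not the datasets used in the paper)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data

def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test, k=5):
    """Share of test points whose predicted label matches the true label."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def bootstrap_accuracy(data, train_frac, n_rounds=20, seed=0):
    """ASSUMED scheme: per round, split by train_frac, then resample the
    training part with replacement and score on the held-out part."""
    rng = random.Random(seed)
    n_train = round(train_frac * len(data))
    scores = []
    for _ in range(n_rounds):
        shuffled = data[:]
        rng.shuffle(shuffled)
        pool, test = shuffled[:n_train], shuffled[n_train:]
        train = [rng.choice(pool) for _ in range(n_train)]  # with replacement
        scores.append(accuracy(train, test))
    return sum(scores) / len(scores)

def kfold_accuracy(data, n_folds=10, seed=0):
    """Plain k-fold cross-validation: each fold serves once as the test set."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(accuracy(train, test))
    return sum(scores) / len(scores)

data = make_data()
results = {
    f"bootstrap {pct}/{100 - pct}": bootstrap_accuracy(data, pct / 100)
    for pct in (70, 80)
}
results["tenfold CV"] = kfold_accuracy(data)
for name, score in results.items():
    print(f"{name}: mean accuracy {score:.3f}")
```

The numbers this prints describe only the toy data and say nothing about the paper's findings; the point is the shape of the comparison, where the train/test proportion is a parameter of the bootstrap while tenfold cross-validation fixes the split through the number of folds.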

Keywords: the bootstrap; classification; cross-validation; repeated train/test splitting
JEL-codes: C38 C52 C55
Date: 2021

Downloads: (external link)
https://doi.org/10.2478/bsrj-2021-0015 (text/html)



Persistent link: https://EconPapers.repec.org/RePEc:bit:bsrysr:v:12:y:2021:i:1:p:228-242:n:9

DOI: 10.2478/bsrj-2021-0015


Business Systems Research is currently edited by Mirjana Pejić Bach

More articles in Business Systems Research from Sciendo
Bibliographic data for series maintained by Peter Golla.

 
Page updated 2025-03-19
Handle: RePEc:bit:bsrysr:v:12:y:2021:i:1:p:228-242:n:9