On Subsampling Procedures for Support Vector Machines
Roberto Bárcenas,
Maria Gonzalez-Lima,
Joaquin Ortega and
Adolfo Quiroz
Additional contact information
Roberto Bárcenas: Facultad de Ciencias, Universidad Nacional Autónoma de México, Ciudad de Mexico 04510, Mexico
Maria Gonzalez-Lima: Departamento Matemáticas y Estadística, Universidad del Norte, Barranquilla 080001, Colombia
Joaquin Ortega: CEMSE, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
Adolfo Quiroz: Departamento de Matemáticas, Universidad de los Andes, Bogota 111711, Colombia
Mathematics, 2022, vol. 10, issue 20, 1-27
Abstract:
Herein, theoretical results are presented to provide insights into the effectiveness of subsampling methods in reducing the number of instances required in the training stage when applying support vector machines (SVMs) for classification in big data scenarios. Our main theorem states that, under some conditions, there exists, with high probability, a feasible solution to the SVM problem for a randomly chosen training subsample, with the corresponding classifier as close as desired (in terms of classification error) to the classifier obtained from training with the complete dataset. The main theorem also reflects the curse of dimensionality in that the assumptions made for the results are much more restrictive in large dimensions; thus, subsampling methods will perform better in lower dimensions. Additionally, we propose an importance sampling and bagging subsampling method that expands the nearest-neighbors ideas presented in previous work. Using different benchmark examples, the method proposed herein presents a faster solution to the SVM problem (without significant loss in accuracy) compared with the available state-of-the-art techniques.
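The idea of comparing a subsample-trained classifier with the full-data classifier can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic Gaussian blobs, the subsamples are drawn uniformly at random (not by the importance sampling or nearest-neighbors scheme the abstract mentions), the linear SVM is fitted with a Pegasos-style stochastic subgradient method, and all function names and parameter values are hypothetical.

```python
import random

def make_blobs(n, rng, center=2.0, sigma=0.8):
    """Synthetic two-class data (assumption): Gaussian blobs at +/-(center, center)."""
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        # The trailing 1.0 is a constant feature that plays the role of a bias term.
        x = [rng.gauss(y * center, sigma), rng.gauss(y * center, sigma), 1.0]
        data.append((x, y))
    return data

def train_linear_svm(data, lam=0.01, epochs=10, seed=0):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        for x, y in rng.sample(data, len(data)):  # one shuffled pass over the data
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrinkage from the regularizer
            if margin < 1.0:  # hinge loss is active: move toward the violated point
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

def bagged_predict(models, x):
    """Majority vote over the classifiers trained on separate subsamples."""
    votes = sum(predict(w, x) for w in models)
    return 1 if votes >= 0 else -1

rng = random.Random(42)
train = make_blobs(2000, rng)
test = make_blobs(500, rng)

# Classifier trained on the complete training set.
w_full = train_linear_svm(train, epochs=5)

# Five classifiers, each trained on a uniform random subsample of 100 points.
models = [train_linear_svm(rng.sample(train, 100), epochs=20, seed=k) for k in range(5)]

acc_full = sum(predict(w_full, x) == y for x, y in test) / len(test)
acc_bag = sum(bagged_predict(models, x) == y for x, y in test) / len(test)
agreement = sum(predict(w_full, x) == bagged_predict(models, x) for x, _ in test) / len(test)

print(f"full-data accuracy:    {acc_full:.3f}")
print(f"bagged-subsample acc.: {acc_bag:.3f}")
print(f"agreement of the two:  {agreement:.3f}")
```

Each subsample here uses 5% of the training points, so the agreement figure gives a rough empirical counterpart to the closeness in classification error that the main theorem describes. The low dimension and well-separated classes of this toy problem are favorable conditions, in line with the abstract's remark that subsampling is expected to work better in lower dimensions.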
Keywords: support vector machines; big data; bagging; importance sampling
JEL-codes: C
Date: 2022
Downloads:
https://www.mdpi.com/2227-7390/10/20/3776/pdf (application/pdf)
https://www.mdpi.com/2227-7390/10/20/3776/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:10:y:2022:i:20:p:3776-:d:941335
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.