Mixed-integer quadratic optimization and iterative clustering techniques for semi-supervised support vector machines

Jan Pablo Burgard, Maria Eduarda Pinheiro and Martin Schmidt
Additional contact information
Jan Pablo Burgard: Trier University
Maria Eduarda Pinheiro: Trier University
Martin Schmidt: Trier University

TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, 2024, vol. 32, issue 3, No 3, 428 pages

Abstract: Among the most famous algorithms for solving classification problems are support vector machines (SVMs), which find a separating hyperplane for a set of labeled data points. In some applications, however, labels are only available for a subset of points. Furthermore, this subset can be non-representative, e.g., due to self-selection in a survey. Semi-supervised SVMs tackle the setting of labeled and unlabeled data and can often improve the reliability of the results. Moreover, additional information about the size of the classes can be available from undisclosed sources. We propose a mixed-integer quadratic optimization (MIQP) model that covers the setting of labeled and unlabeled data points as well as the overall number of points in each class. Since the MIQP’s solution time grows rapidly as the number of variables increases, we introduce an iterative clustering approach to reduce the model’s size. Moreover, we present an update rule for the required big-M values, prove the correctness of the iterative clustering method, and derive tailored dimension-reduction and warm-starting techniques. Our numerical results show that our approach achieves accuracy and precision similar to those of the MIQP formulation, but at much lower computational cost. Thus, we can solve larger problems. Compared with the original SVM formulation, we observe that our approach has even better accuracy and precision for biased samples.
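
The abstract does not state the model itself, so the following is only a minimal sketch of how big-M constraints can tie a binary class indicator to the side of a hyperplane, under assumptions that are not taken from the paper. It evaluates one fixed candidate hyperplane rather than solving the MIQP. The function name `evaluate_candidate`, the penalty weights `C1` and `C2`, the target `tau` (here read as the number of unlabeled points expected on the positive side), and the particular objective are all hypothetical choices for illustration.

```python
from typing import List, Sequence, Tuple


def score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed value w.x + b of point x relative to the hyperplane (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def evaluate_candidate(
    w: Sequence[float],
    b: float,
    labeled: List[Tuple[Sequence[float], int]],  # (point, label in {-1, +1})
    unlabeled: List[Sequence[float]],
    tau: int,          # assumed target count of unlabeled points in the positive class
    C1: float = 1.0,   # hypothetical weight on labeled hinge slack
    C2: float = 1.0,   # hypothetical weight on class-size deviation
):
    # Hinge slack for each labeled point, as in a soft-margin SVM.
    xi = [max(0.0, 1.0 - y * score(w, b, x)) for x, y in labeled]

    # Binary indicator z_j = 1 if the unlabeled point lies on the positive side.
    scores = [score(w, b, x) for x in unlabeled]
    z = [1 if s >= 0 else 0 for s in scores]

    # Smallest big-M that is valid for this fixed hyperplane and these points.
    M = max((abs(s) for s in scores), default=0.0)

    # Big-M linking constraints: s_j <= M * z_j and s_j >= -M * (1 - z_j).
    tol = 1e-9
    for s, zj in zip(scores, z):
        assert s <= M * zj + tol and s >= -M * (1 - zj) - tol

    # Deviation from the assumed class-size information.
    eta = abs(sum(z) - tau)

    objective = 0.5 * sum(wi * wi for wi in w) + C1 * sum(xi) + C2 * eta
    return z, M, eta, objective


# Toy data (made up for illustration only).
labeled_pts = [([2.0, 0.0], 1), ([-2.0, 0.0], -1)]
unlabeled_pts = [[1.0, 1.0], [3.0, -1.0], [-0.5, 0.0]]
z, M, eta, objective = evaluate_candidate([1.0, 0.0], 0.0, labeled_pts, unlabeled_pts, tau=1)
```

In an actual MIQP the hyperplane and the indicators would be decision variables chosen by a solver, and M would have to bound the scores over all hyperplanes under consideration, which is presumably why the abstract mentions an update rule for the big-M values.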

Keywords: Semi-supervised learning; Support vector machines; Clustering; Mixed-integer quadratic optimization; 90C11; 90C90; 90-08; 68T99
Date: 2024

Downloads: (external link)
http://link.springer.com/10.1007/s11750-024-00668-w Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:topjnl:v:32:y:2024:i:3:d:10.1007_s11750-024-00668-w

Ordering information: This journal article can be ordered from
http://link.springer.de/orders.htm

DOI: 10.1007/s11750-024-00668-w

TOP: An Official Journal of the Spanish Society of Statistics and Operations Research is currently edited by Juan José Salazar González and Gustavo Bergantiños

More articles in TOP: An Official Journal of the Spanish Society of Statistics and Operations Research from Springer, Sociedad de Estadística e Investigación Operativa
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:topjnl:v:32:y:2024:i:3:d:10.1007_s11750-024-00668-w