On semi-supervised learning

A. Cholaquidis, R. Fraiman and M. Sued
Additional contact information
A. Cholaquidis: Universidad de la República
R. Fraiman: Universidad de la República
M. Sued: Facultad de Ciencias Exactas y Naturales

TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, 2020, vol. 29, issue 4, No 7, 914-937

Abstract: Major efforts have been made, mostly in the machine learning literature, to construct good predictors combining unlabelled and labelled data. These methods are known as semi-supervised. They deal with the problem of how to take advantage, if possible, of a huge amount of unlabelled data to perform classification in situations where there are few labelled data. This is not always feasible: it depends on whether the labels can be inferred from the distribution of the unlabelled data. Nevertheless, several algorithms have been proposed recently. In this work, we present a new method that, under almost necessary conditions, asymptotically attains the performance of the best theoretical rule when the size of the unlabelled sample goes to infinity, even if the size of the labelled sample remains fixed. Its performance and computational time are assessed through simulations and on the well-known “Isolet” real data of phonemes, where a strong dependence on the choice of the initial training sample is shown. The main focus of this work is to elucidate when and why semi-supervised learning works in the asymptotic regime described above. The set of necessary assumptions, although reasonable, shows that semi-parametric methods only attain consistency for very well-conditioned problems.

Keywords: Semi-supervised learning; Small training sample; Consistency; 62G08; 68T05; 68Q32
Date: 2020
Downloads: (external link)
http://link.springer.com/10.1007/s11749-019-00690-2 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:testjl:v:29:y:2020:i:4:d:10.1007_s11749-019-00690-2

Ordering information: This journal article can be ordered from
http://www.springer. ... cs/journal/11749/PS2

DOI: 10.1007/s11749-019-00690-2

TEST: An Official Journal of the Spanish Society of Statistics and Operations Research is currently edited by Alfonso Gordaliza and Ana F. Militino

More articles in TEST: An Official Journal of the Spanish Society of Statistics and Operations Research from Springer, Sociedad de Estadística e Investigación Operativa
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:testjl:v:29:y:2020:i:4:d:10.1007_s11749-019-00690-2