A Semi-Supervised Active Learning Method for Structured Data Enhancement with Small Samples

Leng, Fangling; Li, Fan; Lv, Wei; Bao, Yubin; Liu, Xiaofeng; Zhang, Tiancheng; Yu, Ge

Fangling Leng, Fan Li, Wei Lv, Yubin Bao, Xiaofeng Liu, Tiancheng Zhang and Ge Yu
Additional contact information
Fangling Leng: School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Fan Li: School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Wei Lv: School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Yubin Bao: School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Xiaofeng Liu: School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Tiancheng Zhang: School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
Ge Yu: School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China

Mathematics, 2024, vol. 12, issue 17, 1-22

Abstract: To address the small volume of structured data and its uneven distribution among classes in machine learning tasks, a supervised generation method for structured data called WAGAN and a cyclic sampling method named SACS (Semi-supervised and Active-learning Cyclic Sampling), based on semi-supervised active learning, are proposed. The loss function and neural network structure are optimized, and the quantity and quality of the small sample set are enhanced. To make the generated pseudo-labels more reliable, a Semi-supervised Active learning Framework (SAF) is designed. This framework reassigns class labels to samples, which not only enhances the reliability of generated samples but also reduces the influence of noise and uncertainty on the generation of pseudo-labels. To mine the diversity information of generated samples, an uncertainty sampling strategy based on spatial overlap is designed. This strategy incorporates the idea of spatial overlap and uses global and local sampling methods to calculate the information content of generated samples. Experimental results show that the proposed method performs better than other data enhancement methods on three different datasets. Compared with the original data, the average F1-macro value of the classification model is improved by 11.5%, 16.1%, and 19.6% relative to the compared methods.
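For readers unfamiliar with the macro-averaged F1 score reported in the abstract, the following is a minimal sketch of how that metric is conventionally computed: the F1 score is calculated separately for each class and the per-class scores are averaged without weighting, so minority classes count as much as majority ones. The function name `f1_macro` and the toy labels are assumptions made for illustration only; they are not taken from the paper or its experiments.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    # Consider every class that appears in either the truth or the predictions.
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); treated as 0 when the class has no hits.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, imbalanced example (hypothetical labels, not data from the paper).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(f1_macro(y_true, y_pred))
```

In this toy case the majority class scores 0.75 and the minority class 0.5, so the macro average (0.625) sits below the plain accuracy of 4/6. This is why the metric is a common choice for class-imbalanced tasks such as the one the abstract describes.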

Keywords: few-shot learning; structured data augmentation; generative adversarial network; semi-supervised learning; active learning
JEL-codes: C
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/17/2634/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/17/2634/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:17:p:2634-:d:1463462

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.