TabFedSL: A Self-Supervised Approach to Labeling Tabular Data in Federated Learning Environments
Ruixiao Wang,
Yanxin Hu,
Zhiyu Chen,
Jianwei Guo and
Gang Liu
Additional contact information
Ruixiao Wang: School of Computer Science and Engineering, Changchun University of Technology, Changchun 130102, China
Yanxin Hu: School of Computer Science and Engineering, Changchun University of Technology, Changchun 130102, China
Zhiyu Chen: School of Computer Science and Engineering, Changchun University of Technology, Changchun 130102, China
Jianwei Guo: School of Computer Science and Engineering, Changchun University of Technology, Changchun 130102, China
Gang Liu: School of Computer Science and Engineering, Changchun University of Technology, Changchun 130102, China
Mathematics, 2024, vol. 12, issue 8, 1-20
Abstract:
Self-supervised learning has proven effective at addressing data labeling problems. Its success depends largely on access to large, high-quality datasets with diverse features, and on exploiting the spatial, temporal, and semantic structures present in the data. However, domains such as finance, healthcare, and insurance rely primarily on tabular data, which poses challenges for traditional data augmentation methods aimed at improving data quality. Furthermore, the privacy-sensitive nature of these domains makes it difficult to acquire the extensive, high-quality datasets needed to train effective self-supervised models. To tackle these challenges, we propose a novel framework that combines self-supervised learning with Federated Learning (FL). The approach aims to support training on distributed data while preserving training quality. The framework improves on the conventional self-supervised data augmentation paradigm by labeling data through segmentation into subsets. It introduces noise by splitting the data into subsets, and in a distributed environment it can reach a level of performance on par with centralized learning. We evaluate the approach experimentally on several public tabular datasets. The results demonstrate the effectiveness and generalizability of the proposed method in scenarios involving unlabeled data and distributed settings.
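The two ideas named in the abstract, splitting tabular data into subsets and training across distributed clients, can be illustrated with a minimal sketch. The abstract does not specify how subsets are formed or how client models are combined, so everything here is an assumption for illustration: the helper names `column_subsets` and `fed_avg` are hypothetical, the split is a simple contiguous partition of feature columns, and the aggregation is standard sample-weighted federated averaging rather than the paper's own procedure.

```python
def column_subsets(rows, n_subsets):
    """Partition the feature columns of a table into contiguous blocks.

    Assumption: each block acts as one partial "view" of every row.
    The paper's actual subset construction is not given in the abstract.
    """
    n_cols = len(rows[0])
    base, extra = divmod(n_cols, n_subsets)
    views, start = [], 0
    for k in range(n_subsets):
        # The first `extra` blocks get one additional column.
        width = base + (1 if k < extra else 0)
        views.append([row[start:start + width] for row in rows])
        start += width
    return views


def fed_avg(client_params, client_sizes):
    """Sample-weighted average of client parameter vectors.

    Assumption: standard federated averaging, used here only as a
    stand-in for whatever aggregation rule the paper applies.
    """
    total = float(sum(client_sizes))
    dim = len(client_params[0])
    return [
        sum(params[j] * size / total
            for params, size in zip(client_params, client_sizes))
        for j in range(dim)
    ]


# Toy table: 2 rows, 6 feature columns, split into 3 views.
rows = [[1, 2, 3, 4, 5, 6],
        [7, 8, 9, 10, 11, 12]]
views = column_subsets(rows, 3)

# Two hypothetical clients holding 1 and 3 samples respectively.
global_params = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this toy run each view keeps both rows but only two of the six columns, and the client with three samples contributes three times the weight of the client with one.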
Keywords: Federated Learning; self-supervised learning; tabular data; deep learning
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/8/1158/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/8/1158/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:8:p:1158-:d:1374418
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.