EconPapers    
Economics at your fingertips  
 

Efficient Federated Learning with Pre-Trained Large Language Model Using Several Adapter Mechanisms

Gyunyeop Kim, Joon Yoo () and Sangwoo Kang ()
Additional contact information
Gyunyeop Kim: School of Computing, Gachon University, 1342, Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Republic of Korea
Joon Yoo: School of Computing, Gachon University, 1342, Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Republic of Korea
Sangwoo Kang: School of Computing, Gachon University, 1342, Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Republic of Korea

Mathematics, 2023, vol. 11, issue 21, 1-19

Abstract: Recent advancements in deep learning have led to various challenges, one of which is the issue of data privacy in training data. To address this issue, federated learning, a technique that merges models trained by clients on servers, has emerged as an attractive solution. However, federated learning faces challenges related to data heterogeneity and system heterogeneity. Recent observations suggest that incorporating pre-trained models into federated learning can mitigate some of these challenges. Nonetheless, the main drawback of pre-trained models lies in their typically large model size, leading to excessive data transmission when clients send these models to the server. Additionally, federated learning involves multiple global steps, which means transmitting a large language model to multiple clients results in too much data exchange. In this paper, we propose a novel approach to address this challenge using adapters. Adapters demonstrate training efficiency by training a small capacity adapter layer alongside a large language model. This unique characteristic reduces the volume of data transmission, offering a practical solution to the problem. The evaluation results demonstrate that the proposed method achieves a reduction in training time of approximately 20–40% and a transmission speed improvement of over 98% compared to previous approaches.

Keywords: federated learning; deep learning; transfer learning; adapter transformer (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2023
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://www.mdpi.com/2227-7390/11/21/4479/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/21/4479/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:21:p:4479-:d:1269828

Access Statistics for this article

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:11:y:2023:i:21:p:4479-:d:1269828