Privacy in Federated Learning Natural Language Models
Phung Lai and
C. Ariel Pinto
Additional contact information
Phung Lai: SUNY-Albany
C. Ariel Pinto: SUNY-Albany
A chapter in Handbook of Trustworthy Federated Learning, 2025, pp 259-287 from Springer
Abstract:
It has become common to publish large language models that have been trained on private datasets. However, large language models can memorize and leak individual training examples, which severely compromises the privacy and security of those datasets. In this chapter, we discuss training language models in Federated Learning and the privacy and security challenges of the training process. We introduce user-entity differential privacy (UeDP), a novel concept that provides formal privacy protection simultaneously to both sensitive entities in textual data and data owners in learning natural language models (NLMs). To preserve UeDP, we develop a novel algorithm, called UeDP-Alg, that optimizes the trade-off between privacy loss and model utility with a tight sensitivity bound derived from seamlessly combining user and sensitive entity sampling processes. Extensive theoretical analysis and evaluation show that UeDP-Alg outperforms baseline approaches in model utility under the same privacy budget consumption on several NLM tasks, using benchmark datasets. The chapter then discusses extending UeDP to address privacy problems in training large language models, including in Federated Learning.
Date: 2025
Persistent link: https://EconPapers.repec.org/RePEc:spr:spochp:978-3-031-58923-2_9
Ordering information: This item can be ordered from
http://www.springer.com/9783031589232
DOI: 10.1007/978-3-031-58923-2_9
More chapters in Springer Optimization and Its Applications from Springer