Generalized Federated Learning via Gradient Norm-Aware Minimization and Control Variables
Yicheng Xu,
Wubin Ma,
Chaofan Dai,
Yahui Wu and
Haohao Zhou
Additional contact information
Yicheng Xu: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Wubin Ma: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Chaofan Dai: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Yahui Wu: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Haohao Zhou: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Mathematics, 2024, vol. 12, issue 17, 1-19
Abstract:
Federated Learning (FL) is a promising distributed machine learning framework that emphasizes privacy protection. However, inconsistencies between local optimization objectives and the global objective, commonly referred to as client drift, arise primarily from non-independent and identically distributed (non-IID) data, multiple local training steps, and partial client participation in training. Most current research on this challenge relies on the empirical risk minimization (ERM) principle and gives little consideration to the connection between the global loss landscape and generalization capability. This study proposes FedGAM, an FL algorithm that incorporates Gradient Norm-Aware Minimization (GAM) to efficiently search for a locally flat landscape. Specifically, FedGAM modifies the client model training objective to minimize the loss value and first-order flatness simultaneously, thereby seeking flat minima. To improve global flatness directly, we further propose FedGAM-CV, which employs control variables to correct local updates, guiding each client to train its model in a globally flat direction. Experiments on three datasets (CIFAR-10, MNIST, and FashionMNIST) demonstrate that the proposed algorithms outperform existing FL baselines, effectively finding flat minima and addressing the client drift problem.
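The role of control variables in correcting client drift can be illustrated with a toy sketch. The abstract does not give FedGAM-CV's update rule, so the code below is not the authors' method: it swaps in a SCAFFOLD-style control-variate correction and omits the GAM flatness term entirely. The two quadratic client losses, the function names (`local_grad`, `run`), and all hyperparameters are hypothetical and chosen only so that the effect is easy to see.

```python
import numpy as np

def local_grad(w, a, m):
    """Gradient of the toy client loss 0.5 * a * (w - m)**2."""
    return a * (w - m)

def run(rounds=100, local_steps=5, lr=0.1, use_control_variates=True):
    # Hypothetical non-IID clients: different curvatures `a` and minimizers `m`.
    a = np.array([1.0, 3.0])
    m = np.array([-1.0, 3.0])
    w_global = 0.0
    c_clients = np.zeros_like(a)  # per-client control variates (assumed SCAFFOLD-style)
    c_global = 0.0                # server control variate
    for _ in range(rounds):
        # Every client starts the round from the current global model.
        w_locals = np.full_like(a, w_global)
        for _ in range(local_steps):
            g = local_grad(w_locals, a, m)
            if use_control_variates:
                # Correct each local gradient toward the global direction.
                g = g - c_clients + c_global
            w_locals = w_locals - lr * g
        if use_control_variates:
            # Refresh control variates from the observed local progress.
            c_new = c_clients - c_global + (w_global - w_locals) / (local_steps * lr)
            c_global = c_global + np.mean(c_new - c_clients)
            c_clients = c_new
        # Full participation: the server averages all client models.
        w_global = float(np.mean(w_locals))
    return w_global

w_star = 2.0  # minimizer of the average loss: sum(a * m) / sum(a)
w_fedavg = run(use_control_variates=False)
w_cv = run(use_control_variates=True)
```

Under these assumptions, plain averaging with several local steps settles at a point away from `w_star`, because the clients have different curvatures, while the corrected run converges to `w_star`. A faithful FedGAM-CV implementation would additionally include the first-order flatness term in each client's objective.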
Keywords: federated learning; client drift; distributed learning
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/17/2644/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/17/2644/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:17:p:2644-:d:1464157
Mathematics is currently edited by Ms. Emma He
Bibliographic data for series maintained by MDPI Indexing Manager.