Generalized Federated Learning via Gradient Norm-Aware Minimization and Control Variables

Xu, Yicheng; Ma, Wubin; Dai, Chaofan; Wu, Yahui; Zhou, Haohao

Yicheng Xu, Wubin Ma, Chaofan Dai, Yahui Wu and Haohao Zhou
Additional contact information
Yicheng Xu: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Wubin Ma: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Chaofan Dai: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Yahui Wu: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Haohao Zhou: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

Mathematics, 2024, vol. 12, issue 17, 1-19

Abstract: Federated Learning (FL) is a promising distributed machine learning framework that emphasizes privacy protection. However, local optimization objectives are often inconsistent with the global objective, a problem commonly referred to as client drift, which arises primarily from non-independent and identically distributed (Non-IID) data, multiple local training steps, and partial client participation in training. Most current research on this challenge is based on the empirical risk minimization (ERM) principle and gives little consideration to the connection between the global loss landscape and generalization capability. This study proposes FedGAM, an FL algorithm that incorporates Gradient Norm-Aware Minimization (GAM) to efficiently search for a locally flat landscape. Specifically, FedGAM modifies the client model training objective to simultaneously minimize the loss value and first-order flatness, thereby seeking flat minima. To directly smooth the global flatness, we further propose FedGAM-CV, which employs control variables to correct local updates, guiding each client to train models in a globally flat direction. Experiments on three datasets (CIFAR-10, MNIST, and FashionMNIST) demonstrate that our proposed algorithms outperform existing FL baselines, effectively finding flat minima and addressing the client drift problem.
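
The abstract does not give the update rules, so the following is only a toy sketch of the two ideas it names, not the authors' FedGAM or FedGAM-CV. It assumes quadratic client losses, a simplified flatness term (a penalty on the gradient norm at the current point rather than GAM's worst case over a neighborhood), and a SCAFFOLD-style control-variable correction. All names (`flat_grad`, `local_update`, `run`, `alpha`, `r`) are hypothetical.

```python
import math

def grad(w, a, c):
    """Gradient of the toy client loss 0.5 * sum(a_k * (w_k - c_k)^2)."""
    return [ak * (wk - ck) for wk, ak, ck in zip(w, a, c)]

def flat_grad(w, a, c, alpha, r=1e-3):
    """Gradient of loss + alpha * ||grad loss|| (an assumed, simplified penalty).
    The norm term's gradient is a Hessian-vector product along the unit
    gradient direction, approximated here by a finite difference."""
    g = grad(w, a, c)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return g
    v = [x / norm for x in g]
    g_shift = grad([wk + r * vk for wk, vk in zip(w, v)], a, c)
    hvp = [(gs - gk) / r for gs, gk in zip(g_shift, g)]
    return [gk + alpha * hk for gk, hk in zip(g, hvp)]

def local_update(w_global, a, c, c_i, c_global, alpha, lr=0.1, steps=5):
    """Local steps corrected by (c_global - c_i); returns the local model
    and a refreshed client control variable (SCAFFOLD-style assumption)."""
    w = list(w_global)
    for _ in range(steps):
        g = flat_grad(w, a, c, alpha)
        w = [wk - lr * (gk - ci + cg)
             for wk, gk, ci, cg in zip(w, g, c_i, c_global)]
    c_new = [ci - cg + (wg - wk) / (steps * lr)
             for ci, cg, wg, wk in zip(c_i, c_global, w_global, w)]
    return w, c_new

def run(alpha=0.1, rounds=200):
    # Two hypothetical Non-IID clients: (curvatures, loss centers).
    clients = [([1.0, 2.0], [1.0, 0.0]), ([2.0, 1.0], [-1.0, 2.0])]
    dim = 2
    w = [0.0] * dim
    c_loc = [[0.0] * dim for _ in clients]
    c_glob = [0.0] * dim
    for _ in range(rounds):
        results = [local_update(w, a, c, c_loc[i], c_glob, alpha)
                   for i, (a, c) in enumerate(clients)]
        # Full participation: average models and control variables.
        w = [sum(res[0][k] for res in results) / len(clients) for k in range(dim)]
        c_loc = [res[1] for res in results]
        c_glob = [sum(ci[k] for ci in c_loc) / len(clients) for k in range(dim)]
    return w
```

With `alpha=0.0` the penalty vanishes and `run` should approach the plain ERM minimizer of the averaged toy loss, roughly (-1/3, 2/3) for these clients, even though each client takes several local steps on its own objective. A positive `alpha` shifts the solution because of the gradient-norm term.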

Keywords: federated learning; client drift; distributed learning
JEL-codes: C
Date: 2024

Downloads:
https://www.mdpi.com/2227-7390/12/17/2644/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/17/2644/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:17:p:2644-:d:1464157

Mathematics is currently edited by Ms. Emma He
