Communication-efficient distributed estimator for generalized linear models with a diverging number of covariates
Ping Zhou, Zhen Yu, Jingyi Ma, Maozai Tian and Ye Fan
Computational Statistics & Data Analysis, 2021, vol. 157, issue C
Abstract:
It has become increasingly common for large-scale data sets to be stored in a distributed manner across a large number of clients. This study develops a distributed estimator for generalized linear models (GLMs) in the “large n, diverging p_n” framework under a weak assumption on the number of clients. When the dimension p_n diverges at the rate o(n), the asymptotic efficiency of the global maximum likelihood estimator (MLE), the one-step MLE, and the aggregated estimating equation (AEE) estimator for GLMs is established. A novel distributed estimator requiring two rounds of communication is then proposed. It has the same asymptotic efficiency as the global MLE under p_n = o(n). Its assumption on the number of clients is weaker than that of the AEE estimator, which makes the proposed method more practical for real-world applications. Simulations and a case study demonstrate the satisfactory finite-sample performance of the proposed estimator.
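Illustration (not taken from the article): the abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of a generic two-round, one-step distributed estimator for logistic regression, one common GLM. It assumes a sample-size-weighted average of local MLEs as the pilot estimate, followed by a single Newton step on the aggregated score and information. The function names, the simulated data, and the choice of pilot estimate are all hypothetical and may differ from the authors' estimator.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def score_and_info(X, y, beta):
    # Summed logistic score vector and Fisher information matrix at beta.
    p = len(beta)
    g = [0.0] * p
    H = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        mu = sigmoid(sum(a * b for a, b in zip(xi, beta)))
        w = mu * (1.0 - mu)
        for j in range(p):
            g[j] += (yi - mu) * xi[j]
            for k in range(p):
                H[j][k] += w * xi[j] * xi[k]
    return g, H


def local_mle(X, y, iters=25):
    # Newton-Raphson for the logistic MLE on one data set, started at zero.
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        g, H = score_and_info(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(H, g))]
    return beta


def one_step_distributed(shards):
    # shards: list of (X, y) pairs, one per client (hypothetical layout).
    p = len(shards[0][0][0])
    n_total = sum(len(y) for _, y in shards)
    # Round 1: clients send local MLEs; the server forms a weighted average.
    beta_avg = [0.0] * p
    for X, y in shards:
        b = local_mle(X, y)
        for j in range(p):
            beta_avg[j] += len(y) / n_total * b[j]
    # Round 2: clients send score and information evaluated at beta_avg.
    g = [0.0] * p
    H = [[0.0] * p for _ in range(p)]
    for X, y in shards:
        gk, Hk = score_and_info(X, y, beta_avg)
        for j in range(p):
            g[j] += gk[j]
            for k in range(p):
                H[j][k] += Hk[j][k]
    # A single Newton step from the pilot estimate.
    step = solve(H, g)
    return beta_avg, [b + s for b, s in zip(beta_avg, step)]


def simulate_shards(num_clients, n_per_client, beta_true, seed=0):
    # Simulated logistic data with standard normal covariates (assumption).
    rng = random.Random(seed)
    shards = []
    for _ in range(num_clients):
        X = [[rng.gauss(0.0, 1.0) for _ in beta_true] for _ in range(n_per_client)]
        y = [1.0 if rng.random() < sigmoid(sum(a * b for a, b in zip(x, beta_true)))
             else 0.0 for x in X]
        shards.append((X, y))
    return shards


beta_true = [0.5, -1.0, 0.25]
shards = simulate_shards(num_clients=8, n_per_client=400, beta_true=beta_true)
beta_avg, beta_one_step = one_step_distributed(shards)
print("averaged local MLE:", beta_avg)
print("one-step estimate: ", beta_one_step)
```

In this sketch each client transmits only a p-vector in the first round and a p-vector plus a p-by-p matrix in the second, so no raw observations leave a client.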
Keywords: Generalized linear models; Large-scale distributed data; Asymptotic efficiency; One-step MLE; Diverging p
Date: 2021
Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947320302450
Full text for ScienceDirect subscribers only.
Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:157:y:2021:i:c:s0167947320302450
DOI: 10.1016/j.csda.2020.107154
Computational Statistics & Data Analysis is currently edited by S.P. Azen
Bibliographic data for series maintained by Catherine Liu.