Efficient fused learning for distributed imbalanced data

Jie Zhou, Guohao Shen, Xuan Chen and Yuanyuan Lin

Communications in Statistics - Theory and Methods, 2022, vol. 51, issue 5, 1306-1317

Abstract: Any data set with an unequal or highly skewed distribution across its classes or categories can be regarded as imbalanced. Owing to privacy concerns and other technical limitations, imbalanced data distributed across locations or machines cannot simply be combined and stored in a single central location. The commonly used naive averaging estimate may be unstable for imbalanced data. In this paper, we propose a fused estimator for logistic regression on distributed imbalanced data that combines all the cases available on all machines and is stable and efficient. The consistency and asymptotic normality of the proposed estimator are established under regularity conditions. Its asymptotic efficiency relative to the oracle estimator based on the entire imbalanced data set is also studied. Extensive simulation studies show that the proposed estimator is as efficient as the oracle estimator in various situations. The method is illustrated with an application to credit card default payment data.
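
The abstract names two reference points: the naive averaging estimate (the mean of the per-machine logistic regression fits) and the oracle estimator (a single fit on the pooled data). The Python sketch below only illustrates those two baselines on simulated data; it is not the paper's fused estimator, whose construction is not given in this record. The function name `fit_logistic`, the sample size, the number of machines, and the true coefficients are all assumptions chosen for illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood logistic regression via Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))          # fitted probabilities
        grad = X.T @ (y - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated imbalanced data (assumed settings): a strongly negative
# intercept makes cases (y = 1) rare relative to controls (y = 0).
rng = np.random.default_rng(0)
n, K = 20000, 10                       # total sample size, number of machines
beta_true = np.array([-3.0, 1.0])      # intercept and slope
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

# Oracle estimator: one fit on the pooled data, which the distributed
# setting assumes cannot actually be assembled in one place.
oracle = fit_logistic(X, y)

# Naive averaging: fit on each machine's block separately, then average.
local_fits = [
    fit_logistic(Xk, yk)
    for Xk, yk in zip(np.array_split(X, K), np.array_split(y, K))
]
naive_avg = np.mean(local_fits, axis=0)

print("case fraction:", y.mean())
print("oracle       :", oracle)
print("naive average:", naive_avg)
```

With these settings each machine still holds a reasonable number of cases, so both estimates are well behaved. Making the intercept more negative or increasing `K` leaves fewer cases per machine, which is the regime in which the abstract says naive averaging may become unstable.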

Date: 2022

Downloads: (external link)
http://hdl.handle.net/10.1080/03610926.2020.1759641 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:lstaxx:v:51:y:2022:i:5:p:1306-1317

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/lsta20

DOI: 10.1080/03610926.2020.1759641

Communications in Statistics - Theory and Methods is currently edited by Debbie Iscoe

More articles in Communications in Statistics - Theory and Methods from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst.

Handle: RePEc:taf:lstaxx:v:51:y:2022:i:5:p:1306-1317