Efficient posterior sampling for high-dimensional imbalanced logistic regression
Deborshee Sen,
Matthias Sachs,
Jianfeng Lu and
David B Dunson
Biometrika, vol. 107, issue 4, 1005-1012
Abstract:
SummaryClassification with high-dimensional data is of widespread interest and often involves dealing with imbalanced data. Bayesian classification approaches are hampered by the fact that current Markov chain Monte Carlo algorithms for posterior computation become inefficient as the number $p$ of predictors or the number $n$ of subjects to classify gets large, because of the increasing computational time per step and worsening mixing rates. One strategy is to employ a gradient-based sampler to improve mixing while using data subsamples to reduce the per-step computational complexity. However, the usual subsampling breaks down when applied to imbalanced data. Instead, we generalize piecewise-deterministic Markov chain Monte Carlo algorithms to include importance-weighted and mini-batch subsampling. These maintain the correct stationary distribution with arbitrarily small subsamples and substantially outperform current competitors. We provide theoretical support for the proposed approach and demonstrate its performance gains in simulated data examples and an application to cancer data.
Keywords: Imbalanced data; Logistic regression; Piecewise-deterministic Markov process; Scalable inference; Subsampling (search for similar items in EconPapers)
References: Add references at CitEc
Citations:
Downloads: (external link)
http://hdl.handle.net/10.1093/biomet/asaa035 (application/pdf)
Access to full text is restricted to subscribers.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:oup:biomet:v:107:y::i:4:p:1005-1012.
Ordering information: This journal article can be ordered from
https://academic.oup.com/journals
Access Statistics for this article
Biometrika is currently edited by Paul Fearnhead
More articles in Biometrika from Biometrika Trust Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK.
Bibliographic data for series maintained by Oxford University Press ().