Informed Sub-Sampling MCMC: Approximate Bayesian Inference for Large Datasets
Florian Maire,
Nial Friel and
Pierre Alquier
Additional contact information
Florian Maire: School of Mathematics and Statistics, University College Dublin; Insight Centre for Data Analytics, University College Dublin
Nial Friel: School of Mathematics and Statistics, University College Dublin; Insight Centre for Data Analytics, University College Dublin
Pierre Alquier: CREST-ENSAE
No 2017-40, Working Papers from Center for Research in Economics and Statistics
Abstract:
This paper introduces a framework for speeding up Bayesian inference conducted in the presence of large datasets. We design a Markov chain whose transition kernel uses an unknown fraction, of fixed size, of the available data that is randomly refreshed throughout the algorithm. Inspired by the Approximate Bayesian Computation (ABC) literature, the subsampling process is guided by the fidelity to the observed data, as measured by summary statistics. The resulting algorithm, Informed Sub-Sampling MCMC, is a generic and flexible approach which, contrary to existing scalable methodologies, preserves the simplicity of the Metropolis-Hastings algorithm. Even though exactness is lost, i.e., the chain distribution only approximates the target, we study and quantify this bias theoretically and show on a diverse set of examples that it yields excellent performance when the computational budget is limited. We also show that, when it is available and cheap to compute, setting the summary statistics to the maximum likelihood estimator is supported by theoretical arguments.
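To make the idea in the abstract concrete, the following is a minimal Python sketch of a subsampling Metropolis-Hastings chain on a toy Gaussian-mean model. It is an illustration under stated assumptions, not the authors' implementation: the function name `iss_mcmc_sketch`, the single-index swap proposal, the weight `exp(-eps * distance)` on subsamples, the N/n rescaling of the subsample log-likelihood, the N(0, 10^2) prior and all tuning values are choices made here for illustration. The summary statistic is the sample mean, which is the maximum likelihood estimator for this model.

```python
import numpy as np

def iss_mcmc_sketch(y, n_sub, n_iter, eps, step, rng):
    """Toy subsampling MH chain for the mean of N(theta, 1) data (illustrative only)."""
    N = y.size
    full_stat = y.mean()                         # summary statistic of the full data
    idx = rng.choice(N, size=n_sub, replace=False)
    in_sub = np.zeros(N, dtype=bool)             # membership mask for the current subsample
    in_sub[idx] = True
    sub_sum = y[idx].sum()
    theta = sub_sum / n_sub
    chain = np.empty(n_iter)

    def log_post(t, sub):
        # Assumed form: subsample log-likelihood rescaled by N/n, plus N(0, 10^2) log-prior.
        return (N / n_sub) * (-0.5 * np.sum((sub - t) ** 2)) - 0.5 * (t / 10.0) ** 2

    for it in range(n_iter):
        # Step 1: propose swapping one subsample index for one outside the subsample.
        pos = rng.integers(n_sub)
        j = rng.integers(N)
        while in_sub[j]:
            j = rng.integers(N)
        new_sum = sub_sum - y[idx[pos]] + y[j]
        d_old = (sub_sum / n_sub - full_stat) ** 2
        d_new = (new_sum / n_sub - full_stat) ** 2
        # Accept subsamples whose summary statistic stays close to the full-data one.
        if np.log(rng.uniform()) < -eps * (d_new - d_old):
            in_sub[idx[pos]] = False
            in_sub[j] = True
            idx[pos] = j
            sub_sum = new_sum

        # Step 2: random-walk Metropolis-Hastings update of theta on the current subsample.
        sub = y[idx]
        prop = theta + step * rng.normal()
        if np.log(rng.uniform()) < log_post(prop, sub) - log_post(theta, sub):
            theta = prop
        chain[it] = theta
    return chain

rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.0, size=5000)              # synthetic data, true mean 2
chain = iss_mcmc_sketch(y, n_sub=200, n_iter=2000, eps=1000.0, step=0.03, rng=rng)
print(chain[500:].mean(), y.mean())
```

Each iteration touches only `n_sub` observations, so its cost does not grow with the full dataset size. Larger values of `eps` keep the subsample's statistic closer to that of the full data, at the price of slower refreshing of the subsample.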
Keywords: Bayesian inference; Big-data; Approximate Bayesian Computation; noisy Markov chain Monte Carlo
Pages: 41
Date: 2017-06-26
New Economics Papers: this item is included in nep-big and nep-ecm
Downloads:
http://crest.science/RePEc/wpstorage/2017-40.pdf CREST working paper version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:crs:wpaper:2017-40