Scaling MCMC Algorithms for Big Data
Chris Nemeth (University of Lancaster)
Friday 4th November, 2016 15:00-16:00 Maths 203
MCMC has become one of the most popular algorithms for analysing complex Bayesian models. Unfortunately, standard implementations of MCMC do not scale well to the 'big data' scenario, where millions of observations need to be evaluated at each iteration, thus making the algorithm prohibitively slow.
Recent work addressing this problem can be broadly categorised into subsampling and divide-and-conquer techniques. I'll give a brief overview of both approaches and then focus on some recent work in collaboration with Chris Sherlock <https://arxiv.org/abs/1605.08576>, in which we split the data across multiple machines and run an independent MCMC algorithm on each machine in parallel. We'll see how, by using Gaussian processes, we can recombine the posterior distributions from the individual machines to form the full posterior.
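
The divide-and-conquer idea described above can be illustrated with a small toy sketch. This is not the method from the talk. It assumes a hypothetical model (Gaussian observations with unknown mean, known unit variance, and a N(0, 100) prior), simulates the "machines" sequentially in one process, and recombines the per-shard posteriors with a simple product of Gaussian approximations rather than with Gaussian processes. All function names and tuning values are illustrative assumptions.

```python
import numpy as np


def log_subposterior(theta, shard, num_shards, prior_var=100.0, noise_var=1.0):
    """Log density (up to a constant) of the posterior for one data shard.

    The prior is raised to the power 1/num_shards so that the product of
    all shard posteriors is proportional to the full posterior.
    """
    log_prior = -0.5 * theta ** 2 / prior_var / num_shards
    log_lik = -0.5 * np.sum((shard - theta) ** 2) / noise_var
    return log_prior + log_lik


def random_walk_metropolis(log_target, init, num_iters, step, rng):
    """Plain random-walk Metropolis sampler for a scalar parameter."""
    samples = np.empty(num_iters)
    current = init
    current_lp = log_target(current)
    for i in range(num_iters):
        proposal = current + step * rng.normal()
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples[i] = current
    return samples


def combine_gaussian(shard_samples):
    """Approximate each shard posterior by a Gaussian and multiply them.

    A product of Gaussians is Gaussian: precisions add, and the mean is
    the precision-weighted average. This is a stand-in for the
    Gaussian-process recombination discussed in the talk.
    """
    precisions = np.array([1.0 / np.var(s) for s in shard_samples])
    means = np.array([np.mean(s) for s in shard_samples])
    combined_var = 1.0 / precisions.sum()
    combined_mean = combined_var * np.sum(precisions * means)
    return combined_mean, combined_var


rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=4000)  # synthetic data
shards = np.array_split(data, 4)                  # one shard per "machine"

shard_samples = []
for shard in shards:
    # Each chain sees only its own shard; in practice these would run
    # on separate machines rather than in a loop.
    def target(theta, s=shard):
        return log_subposterior(theta, s, len(shards))

    chain = random_walk_metropolis(
        target, init=shard.mean(), num_iters=5000, step=0.08, rng=rng
    )
    shard_samples.append(chain[1000:])  # discard burn-in

combined_mean, combined_var = combine_gaussian(shard_samples)
print("combined posterior mean:", combined_mean)
print("combined posterior variance:", combined_var)
```

For this conjugate toy model the full posterior is available in closed form, so the combined mean and variance can be checked against it directly. The Gaussian product is only reasonable when each shard posterior is close to Gaussian, which is one motivation for more flexible recombination schemes.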