Speaker: Andrej Risteski
Abstract: A key task in Bayesian approaches to machine learning is sampling from distributions of the form p(x) = e^{-f(x)}/Z, where Z is a normalizing constant, for some function f whose values and gradients we can query. One prevalent example is sampling posteriors in parametric models, such as latent-variable generative models -- the natural Bayesian analogue of clustering. However, sampling (even very approximately) can be #P-hard. Classical results on sampling focus on log-concave distributions (i.e., f is convex) and show that a natural Markov process called Langevin diffusion mixes in polynomial time. Log-concavity, however, is quite restrictive in practice: in particular, such distributions are unimodal. In this talk, I will address some ways to move beyond this setup.
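
For readers unfamiliar with Langevin diffusion, here is a minimal sketch (not part of the talk) of its standard discretization, the unadjusted Langevin algorithm: iterate x_{k+1} = x_k - eta * grad f(x_k) + sqrt(2*eta) * xi_k with Gaussian noise xi_k, which for small step size eta approximately samples from p(x) = e^{-f(x)}/Z. The function names, step size, and the Gaussian test target below are illustrative assumptions, not the speaker's method.

```python
import numpy as np

def langevin_sample(grad_f, x0, step=1e-2, n_steps=10_000, rng=None):
    """Unadjusted Langevin algorithm: Euler-Maruyama discretization of
    the Langevin diffusion dX_t = -grad f(X_t) dt + sqrt(2) dB_t,
    whose stationary distribution is p(x) proportional to e^{-f(x)}."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps,) + x.shape)
    for k in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Gradient step on f plus injected Gaussian noise.
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * noise
        samples[k] = x
    return samples

# Example with a log-concave target: f(x) = ||x||^2 / 2, so p = N(0, I).
samples = langevin_sample(grad_f=lambda x: x, x0=np.zeros(2))
# After discarding burn-in, the empirical mean is near 0 and variance near 1.
print(samples[2000:].mean(axis=0), samples[2000:].var(axis=0))
```

For a log-concave f as in this example, the chain mixes rapidly; the multimodal targets the talk concerns are exactly the cases where this plain recipe slows down.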