Speaker: Yining Wang

**Abstract:**
I will talk about my recent work on subsampled linear regression. The goal is to randomly subsample a small portion of the data (design points) for regression with reduced sample complexity. Our algorithms apply to both the estimation of the underlying linear model β and the prediction of the real-valued response y at a new data point x. The derived subsampling strategies are minimax optimal under the fixed design setting, up to a small (1 + ε) relative factor. We also give interpretable subsampling probabilities for the random design setting and demonstrate explicit gaps in statistical rates between optimal and baseline (e.g., uniform) subsampling methods.
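To make the setting concrete, below is a minimal sketch of one well-known non-uniform subsampling strategy, leverage-score sampling (the subject of reference paper 1), for ordinary least squares. This is an illustrative baseline only, not the talk's minimax-optimal procedure; the function name, the choice of m, and the reweighting convention are my own assumptions.

```python
import numpy as np

def leverage_subsample_ols(X, y, m, rng=None):
    """Illustrative leverage-score subsampling for OLS.

    Samples m rows with probability proportional to their statistical
    leverage, reweights them, and solves least squares on the subsample.
    (A sketch of algorithmic leveraging, not the talk's optimal scheme.)
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Leverage score of row i is the squared norm of the i-th row of Q,
    # where X = QR; the scores sum to d.
    Q, _ = np.linalg.qr(X)
    lev = np.sum(Q**2, axis=1)
    probs = lev / lev.sum()
    idx = rng.choice(n, size=m, replace=True, p=probs)
    # Reweight rows so the subsampled least-squares objective is an
    # unbiased estimate of the full objective.
    w = 1.0 / np.sqrt(m * probs[idx])
    Xs = X[idx] * w[:, None]
    ys = y[idx] * w
    beta_hat, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta_hat

# Toy fixed-design example: recover a planted β from 2000 design
# points using only m = 200 subsampled rows.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 5))
beta_true = np.arange(1.0, 6.0)
y = X @ beta_true + 0.1 * rng.standard_normal(2000)
beta_hat = leverage_subsample_ols(X, y, m=200, rng=1)
```

Uniform subsampling corresponds to replacing `probs` with the constant vector 1/n; the gap between the two is one of the baselines the abstract contrasts against.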

Reference papers:

1. A statistical perspective on algorithmic leveraging. JMLR'15

2. Minimax subsampling for estimation and prediction in low-dimensional linear regression. arXiv:1601.02068

3. Convergence rates of active learning for maximum likelihood estimation. NIPS'15