An optimization-based approach to uncertainty quantification

26 Apr, 2023, 2:30-4:00 pm, GHC 8102

Speaker: Kayla Scharfstein

Abstract: In recent years, conformal prediction has exploded in popularity as a means of quantifying the uncertainty in black-box machine learning model predictions without making idealized distributional assumptions. I will discuss a different optimization-based approach that relies on minimal distributional assumptions to guarantee appropriate coverage. In particular, I will focus on a recent paper entitled “Universal prediction band via semi-definite programming” by Tengyuan Liang. Constructing the proposed prediction band requires solving a convex optimization program that simultaneously learns the conditional mean and variance functions while trading off their complexities. This optimization program has connections to many others in the literature, including sum-of-squares optimization, phase retrieval, min-norm interpolation, kernel ridge regression, and support vector regression. Tools from empirical process theory are used to show that the resulting prediction band has the correct coverage; this differs from the analysis of conformal prediction intervals, which relies on an exchangeability argument. I will discuss the pros and cons of this optimization-based approach compared to conformal prediction. Time permitting, I will briefly mention some new work with Arun on the related problem of constructing distribution-free prediction sets with valid coverage when the sample size is not known or fixed in advance and may depend on the observed data.
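
To give a concrete feel for the kind of program the abstract describes, here is a rough sketch, not Liang's actual formulation, of jointly fitting a mean and a variance function under a complexity trade-off. It uses cvxpy with a hypothetical polynomial feature map standing in for the paper's kernel machinery; the synthetic data, feature degree, and trade-off weight lam are all illustrative choices. The PSD constraint on Q is the sum-of-squares device that makes the learned variance nonnegative everywhere, and it is what turns the problem into a semi-definite program.

```python
import numpy as np
import cvxpy as cp

# Toy heteroskedastic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 60
x = rng.uniform(-2.0, 2.0, n)
y = np.sin(2.0 * x) + 0.3 * (1.0 + np.abs(x)) * rng.standard_normal(n)

def phi(t, degree=4):
    """Polynomial feature map; a stand-in for the paper's kernel features."""
    t = np.atleast_1d(t)
    return np.vstack([t ** k for k in range(degree + 1)]).T  # shape (m, degree+1)

P = phi(x)          # (n, d) design matrix at the training points
d = P.shape[1]

theta = cp.Variable(d)              # mean:     f(x)  = phi(x) @ theta
Q = cp.Variable((d, d), PSD=True)   # variance: s2(x) = phi(x) @ Q @ phi(x)^T
# Q being PSD makes s2 a sum of squares, hence nonnegative at every x;
# this PSD constraint is where semi-definite programming enters.

resid = y - P @ theta
s2 = cp.sum(cp.multiply(P @ Q, P), axis=1)   # s2_i = phi(x_i)^T Q phi(x_i)

lam = 1.0  # hypothetical weight trading off mean vs. variance complexity
problem = cp.Problem(
    cp.Minimize(cp.sum_squares(theta) + lam * cp.trace(Q)),
    [cp.square(resid) <= s2],   # the band must cover every training point
)
problem.solve(solver=cp.SCS)

# Prediction band at query points: f(x) +/- sqrt(s2(x)).
xq = np.linspace(-2.0, 2.0, 5)
Pq = phi(xq)
f_hat = Pq @ theta.value
s_hat = np.sqrt(np.einsum("ij,jk,ik->i", Pq, Q.value, Pq))
for t, lo, hi in zip(xq, f_hat - s_hat, f_hat + s_hat):
    print(f"x={t:+.2f}: band = [{lo:+.3f}, {hi:+.3f}]")
```

Note that the constraint above only enforces coverage of the training points; the guarantee that the band covers future observations is exactly what the empirical process theory in the paper is used to establish.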