There exist interpolation methods with optimal/near-optimal prediction risk under noise

Mar 19, 3pm, GHC 8102

Speaker: Veeru Sadhanala

Abstract: Given data (x_i, y_i), i = 1..n, an interpolating estimator f fits the training data exactly: f(x_i) = y_i for all i. Until recently, it was folklore that such interpolators cannot have good prediction risk in the presence of label/response noise. But recent empirical findings are surprising -- in some high-dimensional settings, such as neural networks, test error keeps decreasing even as training error is driven to zero. A natural question arises -- can we show that interpolating methods have good prediction risk in theory as well? In the first paper below, a kernel smoothing interpolator is shown to have minimax optimal prediction risk when the true regression function lies in a Hölder class. In the second paper, two nearest-neighbor-like interpolators are shown to have near-optimal prediction risk.

Papers:
1. Does data interpolation contradict statistical optimality?
2. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate
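
For intuition, here is a minimal numerical sketch (in Python) of an interpolating estimator in the spirit of the first paper: a Nadaraya-Watson fit with a singular kernel, which fits the noisy training labels exactly yet still averages nearby responses at new inputs. The specific kernel K(u) = ||u||^(-a) truncated at the unit ball, the bandwidth, the exponent a, and the helper name singular_kernel_interpolator are illustrative assumptions, not the exact construction analyzed in the paper.

import numpy as np

def singular_kernel_interpolator(X_train, y_train, X_test, bandwidth=0.2, a=0.5):
    """Nadaraya-Watson regression with a singular kernel K(u) = ||u||^(-a) on ||u|| <= 1.

    Because the kernel diverges at 0, the weight on an exact training point dominates,
    so the fit reproduces the training labels (interpolation) while predictions at new
    points remain local averages of nearby labels.
    (Illustrative sketch; kernel, bandwidth, and exponent are assumptions.)
    """
    preds = np.empty(len(X_test))
    for j, x in enumerate(X_test):
        d = np.linalg.norm(X_train - x, axis=1) / bandwidth
        exact = d == 0
        if exact.any():
            # At a training input the singular weight dominates; return its label.
            preds[j] = y_train[exact].mean()
            continue
        w = np.where(d <= 1.0, d ** (-a), 0.0)
        if w.sum() == 0:
            # No training point within the bandwidth: fall back to the nearest one.
            preds[j] = y_train[np.argmin(d)]
        else:
            preds[j] = np.dot(w, y_train) / w.sum()
    return preds

# Tiny demo: noisy samples of a smooth function. The estimator reproduces the
# noisy training labels exactly, i.e. it interpolates.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(50)
print(np.allclose(singular_kernel_interpolator(X, y, X), y))  # True: fits the data exactly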