Speaker: Aaditya Ramdas
Abstract: In a wide variety of applications in science and technology, data arrives sequentially (think one subject at a time in a psychology study, or one user at a time on a website), and it is frequently monitored to estimate quantities of interest or test hypotheses about them. Unfortunately, p-values and confidence intervals developed in the batch setting (to be constructed once, with the number of data points specified in advance) are no longer valid when they are constructed and peeked at repeatedly. The sub-field of anytime-valid inference aims to construct p-values and confidence intervals that are valid simultaneously at all times, including data-dependent stopping times. I will give examples of such constructions for various functionals of distributions, like means and quantiles, that are valid under nonparametric conditions and are optimal in certain senses. The main theoretical objects are some very interesting nonnegative (super)martingales. There are a host of fascinating open research questions surrounding adaptivity to parametric assumptions or “simpler” distributions, efficiency relative to an oracle fixed-sample batch method, and expanding the class of functionals for which we can construct tight (including constants) and practical intervals or p-values that are not too conservative. I’ll try to cover a bit of everything: applications, theory, methods, software. This topic forms one central aspect of the SAVI workshop next summer: http://stat.cmu.edu/~aramdas/SAVI/savi20.html.
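Background: a minimal sketch of why nonnegative supermartingales give anytime-valid guarantees, via Ville's inequality; the fixed-$\lambda$ example here is purely illustrative and much cruder than the constructions in the talk and the papers below. Suppose that for each candidate value $m$ of the functional we can build a process $M_t(m)$ that is a nonnegative supermartingale with $M_0(m) = 1$ whenever $m$ equals the true value $\mu$. Ville's inequality then states that for any $\alpha \in (0,1)$,
\[
\Pr\big(\exists\, t \ge 1 : M_t(\mu) \ge 1/\alpha\big) \le \alpha,
\]
so the running set $C_t = \{m : M_t(m) < 1/\alpha\}$ satisfies $\Pr(\exists\, t : \mu \notin C_t) \le \alpha$: it is a confidence sequence, valid at every time and hence at any data-dependent stopping time. As a concrete example with i.i.d. observations $X_i \in [0,1]$ and mean $\mu$, Hoeffding's lemma implies that
\[
M_t(\mu) = \prod_{i=1}^{t} \exp\!\big(\lambda (X_i - \mu) - \lambda^2/8\big), \qquad \lambda > 0,
\]
is a nonnegative supermartingale, and inverting the crossing event $M_t(m) \ge 1/\alpha$ gives the time-uniform lower bound $\mu > \bar X_t - \lambda/8 - \log(1/\alpha)/(\lambda t)$ for all $t$ simultaneously; replacing the fixed $\lambda$ by mixtures or “stitching” (as in the papers below) yields much tighter boundaries with iterated-logarithm rates.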
References: Main publications along this theme are:
1. Sequential estimation of quantiles with applications to A/B-testing and best-arm identification (S. Howard, A. Ramdas)
2. Uniform, nonparametric, nonasymptotic confidence sequences (S. Howard, A. Ramdas, J. Sekhon, J. McAuliffe)
3. Exponential line-crossing inequalities (S. Howard, A. Ramdas, J. Sekhon, J. McAuliffe)