Strategic hypothesis testing and making forecasts that are calibrated for arbitrary data sequences

21 Oct 2021, 2:00pm - 3:30pm, NSH 3305

Speaker: Chirag Gupta

Abstract: A number of recent calibration and uncertainty quantification techniques in machine learning rely on the assumption that the data is i.i.d. I will briefly summarize some of my contributions in this area. The majority of this talk, however, will break away from this suspect assumption.

First, I will present Foster and Vohra's celebrated 1998 result from the paper Asymptotic calibration (https://www.jstor.org/stable/2337364?seq=1#metadata_info_tab_contents), where they show that there exist forecasting strategies that are calibrated for every possible data sequence. I will sketch a proof using Blackwell's approachability theorem (https://msp.org/pjm/1956/6-1/pjm-v6-n1-p01-s.pdf).

Foster and Vohra's result has since grown into the broader field of 'strategic hypothesis testing'. In standard hypothesis testing, we implicitly assume that the hypothesis comes out of 'thin air', without any hidden incentives for the hypothesis-generating entity. The statistician's goal is to design testing algorithms that verify whether the observations are consistent with the proposed hypothesis. In an ML/prediction setting, however, the hypothesis-generating entity is a probabilistic forecaster, and such a forecaster naturally has strong incentives to pass tests. In strategic hypothesis testing, we give the forecaster access to the test and ask whether they can 'strategically' pass it no matter how the data is distributed. I will present a number of results (without proof) by Olszewski and Sandroni (2003–2011) showing that a large class of tests can be passed in this way for every possible data sequence. The talk will closely follow Chapter 18 of the Handbook of Game Theory, Volume 4 (https://www.elsevier.com/books/handbook-of-game-theory/young/978-0-444-53766-9); a publicly available version of the chapter is here (https://faculty.wcas.northwestern.edu/~wol737/Hand.pdf).
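Background: For concreteness, here is a rough sketch of the notion of calibration behind Foster and Vohra's result, stated for binary outcomes; the notation is my own shorthand, and the paper's precise score may differ slightly from the version written here. At each round t, the forecaster announces a probability p_t from a finite grid in [0,1] and then observes an outcome y_t in {0,1}. Let n_T(p) be the number of rounds t <= T with p_t = p, and let \bar{y}_T(p) be the empirical frequency of y_t = 1 over those rounds. The (possibly randomized) forecaster is asymptotically calibrated if its calibration score satisfies

C_T \;=\; \sum_{p} \frac{n_T(p)}{T} \, \bigl| \bar{y}_T(p) - p \bigr| \;\longrightarrow\; 0 \quad \text{as } T \to \infty,

almost surely over the forecaster's internal randomization, for every binary sequence (y_t). The theorem asserts that such a forecasting strategy exists, with no assumptions on how the data sequence is generated.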