Speaker: Simon Du
Abstract: We study which dataset assumptions permit solving offline multi-agent games. In stark contrast to the offline single-agent Markov decision process, we show that the single-strategy concentration assumption is insufficient for learning the Nash equilibrium (NE) strategy in offline two-player zero-sum Markov games. On the other hand, we propose a new assumption named unilateral concentration and design a pessimism-type algorithm that is provably efficient under this assumption. We further show that the unilateral concentration assumption is necessary for learning an NE strategy and that it can be generalized to multi-agent general-sum games. Lastly, we consider offline congestion games and show that different feedback types require qualitatively different dataset coverage conditions.
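To make the contrast between the two coverage conditions concrete, here is a minimal sketch of how the concentrability coefficients are often written in this line of work; the specific notation (a visitation distribution d^{\mu,\nu} induced by the joint policy (\mu,\nu), a dataset distribution \rho, and an NE (\mu^*,\nu^*)) is our assumption for illustration, not necessarily the exact formulation used in the talk:

\[
C_{\mathrm{single}} = \max_{s,a,b} \frac{d^{\mu^*,\nu^*}(s,a,b)}{\rho(s,a,b)},
\qquad
C_{\mathrm{uni}} = \max\Big\{ \sup_{\nu} \max_{s,a,b} \frac{d^{\mu^*,\nu}(s,a,b)}{\rho(s,a,b)},\; \sup_{\mu} \max_{s,a,b} \frac{d^{\mu,\nu^*}(s,a,b)}{\rho(s,a,b)} \Big\}.
\]

In words, single-strategy concentration only asks the dataset to cover the state-action pairs visited by the equilibrium pair itself, whereas unilateral concentration asks it to cover every pair in which one player unilaterally deviates from the equilibrium while the other plays the NE strategy.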