Speaker: Alex Shen
Abstract: Understanding the relationship between training data and model outputs is a key challenge in model interpretability. I will start with a brief overview of training data influence analysis methods as categorized in the survey [1], commenting on the challenges particular to deep neural networks (and other black-box models). I will then present a method proposed in [2] for estimating the influence of individual training points on a neural network's prediction at a new test point. The approach draws on the classical representer theorem together with a partitioning of the network's layers into a feature extractor and a final linear prediction layer.

References:
[1] https://arxiv.org/pdf/2212.04612.pdf
[2] https://arxiv.org/pdf/1811.09720.pdf
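
As a taste of the representer point method in [2]: after retraining the final linear layer with L2 regularization of strength lambda, the pre-softmax prediction Phi(x_t) decomposes exactly as a sum over the n training points of representer values alpha_i * <f_i, f_t>, where f_i is the penultimate-layer feature of training point i and alpha_i = -1/(2*lambda*n) times the gradient of that point's loss with respect to its pre-softmax output. Below is a minimal NumPy sketch of that computation; the function name representer_values and the random stand-in data are illustrative assumptions, and the features and gradients are presumed to have been extracted beforehand from such a regularized last layer.

```python
import numpy as np

def representer_values(features, grads, test_feature, lam):
    """Per-training-point, per-class influence on one test prediction,
    following the decomposition in [2].

    features:     (n, d) penultimate-layer activations f_i of training points
    grads:        (n, c) gradients dL/dPhi at the regularized optimum
    test_feature: (d,)   penultimate-layer activation f_t of the test point
    lam:          L2 regularization strength used for the final linear layer
    """
    n = features.shape[0]
    # alpha_i = -1/(2*lam*n) * dL_i/dPhi_i: the "global importance" of point i
    alphas = -grads / (2.0 * lam * n)          # (n, c)
    # <f_i, f_t>: feature-space similarity between each training point
    # and the test point; summing alpha_i * <f_i, f_t> over i recovers Phi(x_t)
    similarities = features @ test_feature     # (n,)
    return alphas * similarities[:, None]      # (n, c)

# Usage with random stand-in data for a 10-class problem:
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 16))   # f_i for 100 training points
grads = rng.normal(size=(100, 10))      # dL/dPhi for each training point
test_feature = rng.normal(size=16)      # f_t for the test point
influence = representer_values(features, grads, test_feature, lam=1e-3)
print(influence.shape)                  # (100, 10)
```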