Knockoffs and the model-X framework for high-dimensional variable selection

Oct 08, 3pm, GHC 8102

Speaker: Gene Katsevich

Abstract: High-dimensional variable selection is a notoriously difficult problem that statisticians have been grappling with for decades. A recently proposed and increasingly popular methodology in this area is knockoffs, which can be viewed as a wrapper around any variable selection method (such as the lasso) that endows it with rigorous Type-I error guarantees. It is based on the idea that a feature statistic, such as a lasso coefficient, can be calibrated with the help of negative control variables (knockoffs) that are carefully constructed to be pairwise exchangeable with the original variables. The validity of the resulting Type-I error guarantees hinges on the model-X assumption: that the joint distribution of the covariates is known. This assumption is particularly well suited to genome-wide association studies. I will present the knockoffs framework, the model-X assumption, and the application to genome-wide association studies. If time permits, I will also discuss more recent work on constructing knockoffs for general classes of covariate distributions.
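
As a rough illustration of the calibration idea in the abstract, the sketch below applies the data-dependent "knockoff+" threshold from the knockoff filter literature to a vector of feature statistics W. It assumes W has already been computed, for example as |lasso coefficient of variable j| minus |lasso coefficient of its knockoff|, so that large positive values are evidence for variable j and the negative values act as the negative controls. The function names, the example numbers, and the target level q are hypothetical, and the sketch does not construct the knockoffs themselves.

```python
import math


def knockoff_threshold(W, q=0.1, offset=1):
    """Smallest t among the nonzero |W_j| whose estimated false discovery
    proportion is at most q; math.inf if no such t exists.

    offset=1 gives the conservative "knockoff+" variant (an assumption of
    this sketch; offset=0 is the more liberal version).
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        n_neg = sum(1 for w in W if w <= -t)  # negative controls beyond -t
        n_pos = sum(1 for w in W if w >= t)   # candidate selections at t
        if (offset + n_neg) / max(1, n_pos) <= q:
            return t
    return math.inf


def knockoff_select(W, q=0.1, offset=1):
    """Indices j with W_j at or above the data-dependent threshold."""
    t = knockoff_threshold(W, q, offset)
    return [j for j, w in enumerate(W) if w >= t]


# Hypothetical statistics for 12 variables (illustrative numbers only).
W_example = [5, 4, 3.5, 3, 2.5, 2, 1.5, 1.2, -1, 0.5, -0.3, 0.2]
selected = knockoff_select(W_example, q=0.25)
```

The count of statistics below -t serves as an estimate of the number of false positives above t, which is why exchangeability of each variable with its knockoff matters: for a null variable, the sign of W_j is then equally likely to be positive or negative.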