Interpolation in Regression and Classification

Sara van de Geer / Cornell University

Abstract: We consider a linear regression model with p variables and n observations, with p much larger than n. There are then (typically) many interpolators of the data. It is well known that the minimum l2-norm interpolator has a large bias. When the model is sparse, the minimum l1-norm interpolator can have l2-error about as small as the noise level. We discuss this and then turn to the case where one observes only the sign of the regression. We (re-)establish rates of convergence for two minimum l1-norm interpolators of the signs: the first under an l2-restriction, and the second the max-margin classifier related to the AdaBoost algorithm.
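The contrast between the two interpolators in the abstract can be illustrated numerically. The sketch below (not from the talk; the design, sparsity level, and noise scale are illustrative choices) computes the minimum l2-norm interpolator via the Moore–Penrose pseudoinverse and the minimum l1-norm interpolator (basis pursuit) via a standard linear-programming reformulation, then compares their distances to a sparse ground truth:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, p, s = 20, 100, 3                      # n observations, p variables, sparsity s (illustrative)
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta = np.zeros(p)
beta[:s] = 5.0                            # sparse ground truth
y = X @ beta + 0.1 * rng.standard_normal(n)

# Minimum l2-norm interpolator: X^+ y (pseudoinverse solution)
b2 = np.linalg.pinv(X) @ y

# Minimum l1-norm interpolator (basis pursuit): min ||b||_1 s.t. Xb = y.
# LP reformulation: b = u - v with u, v >= 0, minimize sum(u + v).
c = np.ones(2 * p)
A_eq = np.hstack([X, -X])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * p), method="highs")
b1 = res.x[:p] - res.x[p:]

# Both interpolate the data, but the l1 solution stays close to the sparse truth
# while the l2 solution spreads mass over all p coordinates (large bias).
print(np.allclose(X @ b2, y, atol=1e-6), np.allclose(X @ b1, y, atol=1e-6))
print(np.linalg.norm(b2 - beta), np.linalg.norm(b1 - beta))
```

With p >> n and a Gaussian design, X has full row rank almost surely, so both programs are feasible; the printed l2-errors make the abstract's point that the sparse-friendly l1 interpolator can be far more accurate than the minimum l2-norm one.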

Bio: Sara van de Geer has been a full professor at ETH Zürich since 2005. Her main areas of research are empirical process theory, statistical learning theory, and nonparametric and high-dimensional statistics. She is a past president of the Bernoulli Society. She is an associate editor of the journals Electronic Journal of Statistics, Information and Inference, Journal of the European Mathematical Society, Mathematical Statistics and Learning, and Statistics Surveys. She is a correspondent of the Dutch Royal Academy of Sciences, a Knight in the Order of Orange-Nassau, a member of the Leopoldina (Deutsche Akademie der Naturforscher), and a member of Academia Europaea. She was an invited speaker at the International Congress of Mathematicians in 2010.