The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals

Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

Session: Generalization and PAC-Learning 2 (B)

Session Chair: Steve Hanneke

Poster: Poster Session 4

Abstract: We study the problem of agnostic learning under the Gaussian distribution in the Statistical Query (SQ) model. We develop a method for finding hard families of examples for a wide range of concept classes by using LP duality. For Boolean-valued concept classes, we show that the $L^1$-polynomial regression algorithm is essentially best possible among SQ algorithms, and therefore that the SQ complexity of agnostic learning is closely related to the polynomial degree required to approximate any function from the concept class in $L^1$-norm. Using this characterization along with additional analytic tools, we obtain explicit optimal SQ lower bounds for agnostically learning linear threshold functions and the first non-trivial explicit SQ lower bounds for polynomial threshold functions and intersections of halfspaces. We also develop an analogous theory for agnostically learning real-valued functions, and as an application prove near-optimal SQ lower bounds for agnostically learning ReLUs and sigmoids.
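For reference, the two objects the abstract relates can be written out as follows. This is a sketch in standard notation for $L^1$-polynomial regression, not a statement copied from the paper; the symbols $\mathcal{C}$ (concept class), $\mathcal{P}_d$ (polynomials in $n$ variables of degree at most $d$), $\epsilon$, $m$, and $t$ are assumptions introduced here for illustration.

$$
d(\mathcal{C}, \epsilon) \;=\; \min\Bigl\{\, d \;:\; \forall f \in \mathcal{C}\ \ \exists\, p \in \mathcal{P}_d \ \text{ with } \ \mathbb{E}_{x \sim \mathcal{N}(0, I_n)}\bigl[\,|f(x) - p(x)|\,\bigr] \le \epsilon \,\Bigr\}
$$

$$
\hat{p} \;\in\; \arg\min_{p \in \mathcal{P}_d} \ \frac{1}{m} \sum_{i=1}^{m} \bigl|\, p(x^{(i)}) - y^{(i)} \,\bigr|,
\qquad
h(x) \;=\; \operatorname{sign}\bigl(\hat{p}(x) - t\bigr)
$$

The first display is the $L^1$ approximation degree of the class under the Gaussian distribution. The second is the regression step on $m$ labeled samples $(x^{(i)}, y^{(i)})$, which can be solved as a linear program over the coefficients of $p$, followed by rounding at a threshold $t$ chosen to minimize empirical error. In this notation, the abstract's claim for Boolean-valued classes is that the SQ complexity of agnostic learning is governed by $d(\mathcal{C}, \epsilon)$.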
