Agnostic Learning of Disjunctions on Symmetric Distributions
Vitaly Feldman, Pravesh Kothari; 16(106):3455−3467, 2015.
Abstract
We consider the problem of approximating and learning disjunctions (or, equivalently, conjunctions) on symmetric distributions over $\{0,1\}^n$. Symmetric distributions are distributions whose probability mass function is invariant under any permutation of the variables. We prove that for every symmetric distribution $D$ there exists a set $S$ of $n^{O(\log(1/\epsilon))}$ functions such that for every disjunction $c$ there is a function $p$, expressible as a linear combination of functions in $S$, that $\epsilon$-approximates $c$ in $\ell_1$ distance on $D$, that is, $\mathbb{E}_{x \sim D}[|c(x) - p(x)|] \le \epsilon$. This implies an agnostic learning algorithm for disjunctions on symmetric distributions that runs in time $n^{O(\log(1/\epsilon))}$. The best previously known bound is $n^{O(1/\epsilon^4)}$ and follows from the approximation of the more general class of halfspaces (Wimmer, 2010). We also show that there exists a symmetric distribution $D$ such that the minimum degree of a polynomial that $1/3$-approximates the disjunction of all $n$ variables in $\ell_1$ distance on $D$ is $\Omega(\sqrt{n})$. Therefore, the learning result above cannot be achieved via the $\ell_1$-regression with a polynomial basis used in most other agnostic learning algorithms.
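As a non-authoritative sketch of the $\ell_1$-regression framework the abstract refers to: given labeled samples and any fixed feature set, one can fit the best linear combination in empirical $\ell_1$ error with a linear program and threshold the result. The feature set below (a constant plus the single coordinates) and the fixed threshold of $1/2$ are illustrative assumptions only; the paper's learner uses its distribution-specific set $S$ of $n^{O(\log(1/\epsilon))}$ functions, not a polynomial basis.

```python
# Illustrative l1-regression learner (a sketch, not the paper's algorithm):
# fit the best linear combination of a feature set in empirical l1 error
# via a linear program, then threshold the fitted function.
import numpy as np
from scipy.optimize import linprog

def l1_fit(Phi, y):
    """Minimize (1/m) * sum_i |Phi[i] @ w - y[i]| over w.

    Introduce slack variables t with  Phi @ w - y <= t  and
    y - Phi @ w <= t; minimizing sum(t)/m then equals the l1 loss.
    """
    m, d = Phi.shape
    c = np.concatenate([np.zeros(d), np.ones(m) / m])
    A_ub = np.block([[Phi, -np.eye(m)], [-Phi, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * d + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:d]

def learn_disjunction(X, y, features):
    """Return a 0/1 hypothesis x -> [p(x) >= 1/2] with p in span(features).

    A fixed threshold of 1/2 is a simplification; l1-regression learners
    normally choose the threshold minimizing empirical error.
    """
    Phi = np.column_stack([f(X) for f in features])
    w = l1_fit(Phi, y.astype(float))
    def h(Xnew):
        return (np.column_stack([f(Xnew) for f in features]) @ w >= 0.5).astype(int)
    return h

# Placeholder feature set: the constant 1 and the n coordinate functions.
# The paper's set S for a symmetric distribution would replace this.
def coordinate_features(n):
    return [lambda X: np.ones(len(X))] + \
           [lambda X, i=i: X[:, i].astype(float) for i in range(n)]
```

Expressing the $\ell_1$ fit as a linear program over slack variables, as above, is the standard reduction; only the choice of feature set changes between bases.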
Our technique also gives a simple proof that for any product distribution $D$ and every disjunction $c$, there exists a polynomial $p$ of degree $O(\log(1/\epsilon))$ such that $p$ $\epsilon$-approximates $c$ in $\ell_1$ distance on $D$. This was first proved by Blais et al. (2008) via a more involved argument.
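For intuition, the product-distribution statement can be verified directly in the simplest case, the uniform distribution on $\{0,1\}^n$ and the disjunction of all $n$ variables; the short calculation below is a standard illustration, not the paper's proof.

```latex
% Uniform D on {0,1}^n, c(x) = x_1 \vee \dots \vee x_n,
% and k = \lceil \log_2(1/\epsilon) \rceil.
\[
  p(x) \;=\; 1 - \prod_{i=1}^{k} (1 - x_i)
\]
% p has degree k = O(\log(1/\epsilon)) and agrees with c whenever some
% x_i = 1 with i <= k; both take values in {0,1}, so they can differ
% only on the event x_1 = ... = x_k = 0. Hence
\[
  \mathbb{E}_{x \sim D}\bigl[\,|c(x) - p(x)|\,\bigr]
  \;\le\; \Pr_{x \sim D}\bigl[x_1 = \dots = x_k = 0\bigr]
  \;=\; 2^{-k} \;\le\; \epsilon.
\]
```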