Learning Halfspaces Under Log-Concave Densities: Polynomial Approximations and Moment Matching

Daniel Kane, Adam Klivans, Raghu Meka
Proceedings of the 26th Annual Conference on Learning Theory, PMLR 30:522-545, 2013.

Abstract

We give the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of PAC learning the intersection of two halfspaces. We give two very different proofs of this result. The first develops a theory of polynomial approximation for log-concave measures and constructs a low-degree L_1 polynomial approximator for sufficiently smooth functions. The second uses techniques related to the classical moment problem to obtain sandwiching polynomials. Both approaches deviate significantly from known Fourier-based methods, where essentially all previous work required the underlying distribution to have some product structure. Additionally, we show that in the smoothed-analysis setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v30-Kane13,
  title     = {Learning Halfspaces Under Log-Concave Densities: Polynomial Approximations and Moment Matching},
  author    = {Kane, Daniel and Klivans, Adam and Meka, Raghu},
  booktitle = {Proceedings of the 26th Annual Conference on Learning Theory},
  pages     = {522--545},
  year      = {2013},
  editor    = {Shalev-Shwartz, Shai and Steinwart, Ingo},
  volume    = {30},
  series    = {Proceedings of Machine Learning Research},
  address   = {Princeton, NJ, USA},
  month     = {12--14 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v30/Kane13.pdf},
  url       = {https://proceedings.mlr.press/v30/Kane13.html},
  abstract  = {We give the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of PAC learning the intersection of two halfspaces. We give two very different proofs of this result. The first develops a theory of polynomial approximation for log-concave measures and constructs a low-degree L_1 polynomial approximator for sufficiently smooth functions. The second uses techniques related to the classical moment problem to obtain sandwiching polynomials. Both approaches deviate significantly from known Fourier-based methods, where essentially all previous work required the underlying distribution to have some product structure. Additionally, we show that in the smoothed-analysis setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning.}
}
Endnote
%0 Conference Paper
%T Learning Halfspaces Under Log-Concave Densities: Polynomial Approximations and Moment Matching
%A Daniel Kane
%A Adam Klivans
%A Raghu Meka
%B Proceedings of the 26th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2013
%E Shai Shalev-Shwartz
%E Ingo Steinwart
%F pmlr-v30-Kane13
%I PMLR
%P 522--545
%U https://proceedings.mlr.press/v30/Kane13.html
%V 30
%X We give the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of PAC learning the intersection of two halfspaces. We give two very different proofs of this result. The first develops a theory of polynomial approximation for log-concave measures and constructs a low-degree L_1 polynomial approximator for sufficiently smooth functions. The second uses techniques related to the classical moment problem to obtain sandwiching polynomials. Both approaches deviate significantly from known Fourier-based methods, where essentially all previous work required the underlying distribution to have some product structure. Additionally, we show that in the smoothed-analysis setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning.
RIS
TY  - CPAPER
TI  - Learning Halfspaces Under Log-Concave Densities: Polynomial Approximations and Moment Matching
AU  - Daniel Kane
AU  - Adam Klivans
AU  - Raghu Meka
BT  - Proceedings of the 26th Annual Conference on Learning Theory
DA  - 2013/06/13
ED  - Shai Shalev-Shwartz
ED  - Ingo Steinwart
ID  - pmlr-v30-Kane13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 30
SP  - 522
EP  - 545
L1  - http://proceedings.mlr.press/v30/Kane13.pdf
UR  - https://proceedings.mlr.press/v30/Kane13.html
AB  - We give the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of PAC learning the intersection of two halfspaces. We give two very different proofs of this result. The first develops a theory of polynomial approximation for log-concave measures and constructs a low-degree L_1 polynomial approximator for sufficiently smooth functions. The second uses techniques related to the classical moment problem to obtain sandwiching polynomials. Both approaches deviate significantly from known Fourier-based methods, where essentially all previous work required the underlying distribution to have some product structure. Additionally, we show that in the smoothed-analysis setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning.
ER  -
APA
Kane, D., Klivans, A. & Meka, R. (2013). Learning Halfspaces Under Log-Concave Densities: Polynomial Approximations and Moment Matching. Proceedings of the 26th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 30:522-545. Available from https://proceedings.mlr.press/v30/Kane13.html.