L1-regularized Neural Networks are Improperly Learnable in Polynomial Time

Yuchen Zhang, Jason D. Lee, Michael I. Jordan
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:993-1001, 2016.

Abstract

We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has k hidden layers and that the \ell_1-norm of the incoming weights of any neuron is bounded by L. We present a kernel-based method such that, with probability at least 1 - δ, it learns a predictor whose generalization error is at most ε worse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in (1/ε, \log(1/δ), F(k,L)), where F(k,L) is a function depending on (k,L) and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.
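The abstract states the result at a high level: the learner outputs a kernel predictor rather than a network (hence "improper"), yet competes, up to ε, with any network whose per-neuron incoming weights are \ell_1-bounded. The sketch below is not the paper's algorithm; it only illustrates the improper-learning setup by fitting an off-the-shelf RBF kernel ridge regression (scikit-learn) against a randomly generated one-hidden-layer ReLU teacher with \ell_1-normalized incoming weights. The teacher construction, data distribution, and hyperparameters are assumptions made for illustration.

```python
# Minimal illustration of improper learning of a sparse (l1-bounded) neural network.
# NOTE: this substitutes a generic RBF kernel ridge regression for the paper's
# specific kernel construction; all quantities below are illustrative assumptions.

import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
d, n_train, n_test = 20, 2000, 2000

def make_sparse_teacher(d, width=50, L=1.0):
    """One-hidden-layer ReLU network whose incoming weight vectors have l1-norm L."""
    W = rng.normal(size=(width, d))
    W = L * W / np.abs(W).sum(axis=1, keepdims=True)   # enforce ||w_j||_1 = L per neuron
    v = rng.normal(size=width)
    v = v / np.abs(v).sum()                            # l1-bounded output layer as well
    return lambda X: np.maximum(X @ W.T, 0.0) @ v

teacher = make_sparse_teacher(d)

# Inputs drawn from the unit l_infinity ball (bounded inputs are assumed here).
X_train = rng.uniform(-1, 1, size=(n_train, d))
X_test = rng.uniform(-1, 1, size=(n_test, d))
y_train = teacher(X_train) + 0.05 * rng.normal(size=n_train)   # noisy labels
y_test = teacher(X_test)

# Improper learner: the hypothesis is a kernel predictor, not a neural network.
model = KernelRidge(kernel="rbf", gamma=0.5, alpha=1e-3)
model.fit(X_train, y_train)

mse = np.mean((model.predict(X_test) - y_test) ** 2)
print(f"test MSE of the kernel predictor against the teacher network: {mse:.4f}")
```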

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-zhangd16,
  title     = {L1-regularized Neural Networks are Improperly Learnable in Polynomial Time},
  author    = {Zhang, Yuchen and Lee, Jason D. and Jordan, Michael I.},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages     = {993--1001},
  year      = {2016},
  editor    = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume    = {48},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {20--22 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v48/zhangd16.pdf},
  url       = {https://proceedings.mlr.press/v48/zhangd16.html},
  abstract  = {We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has k hidden layers and that the \ell_1-norm of the incoming weights of any neuron is bounded by L. We present a kernel-based method such that, with probability at least 1 - δ, it learns a predictor whose generalization error is at most ε worse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in (1/ε, \log(1/δ), F(k,L)), where F(k,L) is a function depending on (k,L) and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.}
}
APA
Zhang, Y., Lee, J.D. & Jordan, M.I. (2016). L1-regularized Neural Networks are Improperly Learnable in Polynomial Time. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:993-1001. Available from https://proceedings.mlr.press/v48/zhangd16.html.
