Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems

Christopher De Sa, Christopher Re, Kunle Olukotun
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2332-2341, 2015.

Abstract

Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a low-rank least-squares problem, and we prove that, under broad sampling conditions, our method converges globally from a random starting point within O(ε^-1 n log n) steps with constant probability for constant-rank problems. Our modification of SGD relates it to stochastic power iteration. We also show some experiments to illustrate the runtime and convergence of the algorithm.
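As a rough illustration of the setup the abstract describes, the sketch below runs plain SGD on a rank-p factorization A ≈ Y Yᵀ of a symmetric matrix from sampled entries. It is a hedged example, not the paper's algorithm or its step-size scheme: the function name sgd_low_rank, the constant step size eta, and the synthetic data are illustrative assumptions.

import numpy as np

# Hedged sketch: plain SGD on a rank-p factorization A ~= Y @ Y.T from
# sampled entries of a symmetric n x n matrix. This is NOT the paper's
# algorithm or step-size scheme; the constant step size is an
# illustrative assumption.
def sgd_low_rank(entries, n, p, eta=0.01, steps=300_000, seed=0):
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((n, p)) / np.sqrt(n)   # random starting point
    for _ in range(steps):
        i, j, a_ij = entries[rng.integers(len(entries))]
        resid = a_ij - Y[i] @ Y[j]                 # sampled residual
        yi, yj = Y[i].copy(), Y[j].copy()
        # SGD step on (a_ij - y_i . y_j)^2 with respect to both rows
        # (the factor of 2 is absorbed into eta)
        Y[i] += eta * resid * yj
        Y[j] += eta * resid * yi
    return Y

# Tiny synthetic check: approximate a random rank-2 PSD matrix from
# uniformly sampled entries and report the relative reconstruction error.
n, p = 50, 2
rng = np.random.default_rng(1)
Y_true = rng.standard_normal((n, p))
A = Y_true @ Y_true.T
idx = rng.integers(0, n, size=(20_000, 2))
entries = [(i, j, A[i, j]) for i, j in idx]
Y = sgd_low_rank(entries, n, p)
print(np.linalg.norm(Y @ Y.T - A) / np.linalg.norm(A))

In the paper's setting the update is related to a stochastic power iteration, which is where the specific step-size scheme and the O(ε^-1 n log n) rate come from; the sketch above only illustrates the general shape of the computation.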

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-sa15,
  title     = {Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems},
  author    = {Sa, Christopher De and Re, Christopher and Olukotun, Kunle},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages     = {2332--2341},
  year      = {2015},
  editor    = {Bach, Francis and Blei, David},
  volume    = {37},
  series    = {Proceedings of Machine Learning Research},
  address   = {Lille, France},
  month     = {07--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v37/sa15.pdf},
  url       = {https://proceedings.mlr.press/v37/sa15.html},
  abstract  = {Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a low-rank least-squares problem, and we prove that, under broad sampling conditions, our method converges globally from a random starting point within O(ε^-1 n log n) steps with constant probability for constant-rank problems. Our modification of SGD relates it to stochastic power iteration. We also show some experiments to illustrate the runtime and convergence of the algorithm.}
}
Endnote
%0 Conference Paper
%T Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems
%A Christopher De Sa
%A Christopher Re
%A Kunle Olukotun
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-sa15
%I PMLR
%P 2332--2341
%U https://proceedings.mlr.press/v37/sa15.html
%V 37
%X Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a low-rank least-squares problem, and we prove that, under broad sampling conditions, our method converges globally from a random starting point within O(ε^-1 n log n) steps with constant probability for constant-rank problems. Our modification of SGD relates it to stochastic power iteration. We also show some experiments to illustrate the runtime and convergence of the algorithm.
RIS
TY - CPAPER
TI - Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems
AU - Christopher De Sa
AU - Christopher Re
AU - Kunle Olukotun
BT - Proceedings of the 32nd International Conference on Machine Learning
DA - 2015/06/01
ED - Francis Bach
ED - David Blei
ID - pmlr-v37-sa15
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 37
SP - 2332
EP - 2341
L1 - http://proceedings.mlr.press/v37/sa15.pdf
UR - https://proceedings.mlr.press/v37/sa15.html
AB - Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a low-rank least-squares problem, and we prove that, under broad sampling conditions, our method converges globally from a random starting point within O(ε^-1 n log n) steps with constant probability for constant-rank problems. Our modification of SGD relates it to stochastic power iteration. We also show some experiments to illustrate the runtime and convergence of the algorithm.
ER -
APA
Sa, C.D., Re, C. & Olukotun, K. (2015). Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:2332-2341. Available from https://proceedings.mlr.press/v37/sa15.html.
