Alternating Minimization for Mixed Linear Regression

Xinyang Yi, Constantine Caramanis, Sujay Sanghavi
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):613-621, 2014.

Abstract

Mixed linear regression involves the recovery of two (or more) unknown vectors from unlabeled linear measurements; that is, where each sample comes from exactly one of the vectors, but we do not know which one. It is a classic problem, and the natural and empirically most popular approach to its solution has been the EM algorithm. As in other settings, this is prone to bad local minima; however, each iteration is very fast (alternating between guessing labels, and solving with those labels). In this paper we provide a new initialization procedure for EM, based on finding the leading two eigenvectors of an appropriate matrix. We then show that with this, a re-sampled version of the EM algorithm provably converges to the correct vectors, under natural assumptions on the sampling distribution, and with nearly optimal (unimprovable) sample complexity. This provides not only the first characterization of EM’s performance, but also much lower sample complexity as compared to both standard (randomly initialized) EM, and other methods for this problem.
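The abstract describes the core alternating-minimization loop: alternately assign each sample to whichever vector explains it better, then refit each vector by least squares on its assigned samples. The sketch below (Python with NumPy) is a minimal illustration of that loop for the two-component case; it uses a random initialization and no re-sampling, whereas the paper's analyzed algorithm initializes from the leading two eigenvectors of a matrix built from the data and re-samples fresh measurements at each iteration. The function name and parameters are illustrative, not taken from the paper.

import numpy as np

def em_mixed_linear_regression(X, y, n_iters=50, seed=None):
    # Minimal alternating-minimization sketch for two-component mixed
    # linear regression. Illustrative only: random initialization and
    # reuse of all samples each round, unlike the paper's spectral
    # initialization and re-sampled EM variant.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta1 = rng.standard_normal(d)
    beta2 = rng.standard_normal(d)
    for _ in range(n_iters):
        # "E-step": label each sample by the vector with smaller residual.
        r1 = (y - X @ beta1) ** 2
        r2 = (y - X @ beta2) ** 2
        mask = r1 <= r2
        # "M-step": refit each vector by least squares on its own samples.
        if mask.any():
            beta1, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        if (~mask).any():
            beta2, *_ = np.linalg.lstsq(X[~mask], y[~mask], rcond=None)
    return beta1, beta2

As a usage sketch: drawing X with i.i.d. Gaussian rows and generating each y_i from one of two ground-truth vectors chosen at random, the loop typically recovers both vectors (up to swapping the two labels) when started sufficiently close to them, which is exactly the role of the paper's spectral initialization.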

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-yia14,
  title     = {Alternating Minimization for Mixed Linear Regression},
  author    = {Yi, Xinyang and Caramanis, Constantine and Sanghavi, Sujay},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages     = {613--621},
  year      = {2014},
  editor    = {Xing, Eric P. and Jebara, Tony},
  volume    = {32},
  number    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {Beijing, China},
  month     = {22--24 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v32/yia14.pdf},
  url       = {https://proceedings.mlr.press/v32/yia14.html}
}
APA
Yi, X., Caramanis, C., & Sanghavi, S. (2014). Alternating Minimization for Mixed Linear Regression. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):613-621. Available from https://proceedings.mlr.press/v32/yia14.html.
