Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method

David Anderson, Simon Du, Michael Mahoney, Christopher Melgaard, Kunming Wu, Ming Gu
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:19-27, 2015.

Abstract

The CUR matrix decomposition and the related Nyström method build low-rank approximations of data matrices by selecting a small number of representative rows and columns of the data. Here, we introduce novel spectral gap error bounds that judiciously exploit the potentially rapid spectrum decay in the input matrix, a common occurrence in machine learning and data analysis. Our error bounds are much tighter than existing ones for matrices with rapid spectrum decay, and they justify the use of a constant amount of oversampling relative to the rank parameter k, i.e., when the number of columns/rows is ℓ = k + O(1). We demonstrate our analysis on a novel deterministic algorithm, StableCUR, which additionally eliminates a previously unrecognized source of potential instability in CUR decompositions. While our algorithm accepts any method of row and column selection, we implement it with a recent column selection scheme with strong singular value bounds. Empirical results on various classes of real-world data matrices demonstrate that our algorithm is as efficient as, and often outperforms, competing algorithms.
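The CUR decomposition described above approximates a matrix A as the product C U R, where C and R are small subsets of A's actual columns and rows. The following is a minimal illustrative sketch in NumPy, not the paper's StableCUR algorithm: it uses uniform random column/row selection (the paper instead uses a selection scheme with strong singular value bounds), and the function name and parameters are hypothetical.

```python
import numpy as np

def cur_decomposition(A, k, oversample=2, seed=0):
    """Sketch of a CUR approximation: pick ell = k + O(1) columns and rows
    uniformly at random, then solve for the core matrix U.
    Illustrative only; uniform sampling is weaker than the paper's scheme."""
    rng = np.random.default_rng(seed)
    ell = k + oversample  # constant oversampling relative to the rank parameter k
    cols = rng.choice(A.shape[1], size=ell, replace=False)
    rows = rng.choice(A.shape[0], size=ell, replace=False)
    C = A[:, cols]   # actual columns of A
    R = A[rows, :]   # actual rows of A
    # Core matrix: U = C^+ A R^+, the least-squares optimal middle factor
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

# Usage on a matrix with rapidly decaying spectrum (here: exact rank 20),
# where modest oversampling already recovers A almost exactly.
A = np.random.default_rng(1).standard_normal((100, 20)) @ \
    np.random.default_rng(2).standard_normal((20, 80))
C, U, R = cur_decomposition(A, k=20, oversample=2)
err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
```

Because the selected columns and rows are genuine slices of the data, C and R inherit properties such as sparsity and interpretability that a truncated SVD's singular vectors lack.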

Cite this Paper


BibTeX
@InProceedings{pmlr-v38-anderson15,
  title = {{Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nystr\"{o}m Method}},
  author = {Anderson, David and Du, Simon and Mahoney, Michael and Melgaard, Christopher and Wu, Kunming and Gu, Ming},
  booktitle = {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages = {19--27},
  year = {2015},
  editor = {Lebanon, Guy and Vishwanathan, S. V. N.},
  volume = {38},
  series = {Proceedings of Machine Learning Research},
  address = {San Diego, California, USA},
  month = {09--12 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v38/anderson15.pdf},
  url = {https://proceedings.mlr.press/v38/anderson15.html},
  abstract = {The CUR matrix decomposition and the related Nystr\"{o}m method build low-rank approximations of data matrices by selecting a small number of representative rows and columns of the data. Here, we introduce novel \emph{spectral gap} error bounds that judiciously exploit the potentially rapid spectrum decay in the input matrix, a common occurrence in machine learning and data analysis. Our error bounds are much tighter than existing ones for matrices with rapid spectrum decay, and they justify the use of a constant amount of oversampling relative to the rank parameter k, i.e., when the number of columns/rows is $\ell = k + O(1)$. We demonstrate our analysis on a novel deterministic algorithm, \emph{StableCUR}, which additionally eliminates a previously unrecognized source of potential instability in CUR decompositions. While our algorithm accepts any method of row and column selection, we implement it with a recent column selection scheme with strong singular value bounds. Empirical results on various classes of real-world data matrices demonstrate that our algorithm is as efficient as, and often outperforms, competing algorithms.}
}
Endnote
%0 Conference Paper %T Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method %A David Anderson %A Simon Du %A Michael Mahoney %A Christopher Melgaard %A Kunming Wu %A Ming Gu %B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2015 %E Guy Lebanon %E S. V. N. Vishwanathan %F pmlr-v38-anderson15 %I PMLR %P 19--27 %U https://proceedings.mlr.press/v38/anderson15.html %V 38 %X The CUR matrix decomposition and the related Nyström method build low-rank approximations of data matrices by selecting a small number of representative rows and columns of the data. Here, we introduce novel spectral gap error bounds that judiciously exploit the potentially rapid spectrum decay in the input matrix, a common occurrence in machine learning and data analysis. Our error bounds are much tighter than existing ones for matrices with rapid spectrum decay, and they justify the use of a constant amount of oversampling relative to the rank parameter k, i.e., when the number of columns/rows is ℓ = k + O(1). We demonstrate our analysis on a novel deterministic algorithm, StableCUR, which additionally eliminates a previously unrecognized source of potential instability in CUR decompositions. While our algorithm accepts any method of row and column selection, we implement it with a recent column selection scheme with strong singular value bounds. Empirical results on various classes of real-world data matrices demonstrate that our algorithm is as efficient as, and often outperforms, competing algorithms.
RIS
TY - CPAPER TI - Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method AU - David Anderson AU - Simon Du AU - Michael Mahoney AU - Christopher Melgaard AU - Kunming Wu AU - Ming Gu BT - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics DA - 2015/02/21 ED - Guy Lebanon ED - S. V. N. Vishwanathan ID - pmlr-v38-anderson15 PB - PMLR DP - Proceedings of Machine Learning Research VL - 38 SP - 19 EP - 27 L1 - http://proceedings.mlr.press/v38/anderson15.pdf UR - https://proceedings.mlr.press/v38/anderson15.html AB - The CUR matrix decomposition and the related Nyström method build low-rank approximations of data matrices by selecting a small number of representative rows and columns of the data. Here, we introduce novel spectral gap error bounds that judiciously exploit the potentially rapid spectrum decay in the input matrix, a common occurrence in machine learning and data analysis. Our error bounds are much tighter than existing ones for matrices with rapid spectrum decay, and they justify the use of a constant amount of oversampling relative to the rank parameter k, i.e., when the number of columns/rows is ℓ = k + O(1). We demonstrate our analysis on a novel deterministic algorithm, StableCUR, which additionally eliminates a previously unrecognized source of potential instability in CUR decompositions. While our algorithm accepts any method of row and column selection, we implement it with a recent column selection scheme with strong singular value bounds. Empirical results on various classes of real-world data matrices demonstrate that our algorithm is as efficient as, and often outperforms, competing algorithms. ER -
APA
Anderson, D., Du, S., Mahoney, M., Melgaard, C., Wu, K. & Gu, M. (2015). Spectral Gap Error Bounds for Improving CUR Matrix Decomposition and the Nyström Method. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:19-27. Available from https://proceedings.mlr.press/v38/anderson15.html.