A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling

Mostafa Rahmani, Geroge Atia
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1206-1214, 2016.

Abstract

This paper presents a new randomized approach to high-dimensional low rank (LR) plus sparse matrix decomposition. For a data matrix D ∈R^N_1 \times N_2, the complexity of conventional decomposition methods is O(N_1 N_2 r), which limits their usefulness in big data settings (r is the rank of the LR component). In addition, the existing randomized approaches rely for the most part on uniform random sampling, which may be inefficient for many real world data matrices. The proposed subspace learning based approach recovers the LR component using only a small subset of the columns/rows of data and reduces complexity to O(\max(N_1,N_2) r^2). Even when the columns/rows are sampled uniformly at random, the sufficient number of sampled columns/rows is shown to be roughly O(r μ), where μis the coherency parameter of the LR component. In addition, efficient sampling algorithms are proposed to address the problem of column/row sampling from structured data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-rahmani16, title = {A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling}, author = {Rahmani, Mostafa and Atia, Geroge}, booktitle = {Proceedings of The 33rd International Conference on Machine Learning}, pages = {1206--1214}, year = {2016}, editor = {Balcan, Maria Florina and Weinberger, Kilian Q.}, volume = {48}, series = {Proceedings of Machine Learning Research}, address = {New York, New York, USA}, month = {20--22 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v48/rahmani16.pdf}, url = {https://proceedings.mlr.press/v48/rahmani16.html}, abstract = {This paper presents a new randomized approach to high-dimensional low rank (LR) plus sparse matrix decomposition. For a data matrix D ∈R^N_1 \times N_2, the complexity of conventional decomposition methods is O(N_1 N_2 r), which limits their usefulness in big data settings (r is the rank of the LR component). In addition, the existing randomized approaches rely for the most part on uniform random sampling, which may be inefficient for many real world data matrices. The proposed subspace learning based approach recovers the LR component using only a small subset of the columns/rows of data and reduces complexity to O(\max(N_1,N_2) r^2). Even when the columns/rows are sampled uniformly at random, the sufficient number of sampled columns/rows is shown to be roughly O(r μ), where μis the coherency parameter of the LR component. In addition, efficient sampling algorithms are proposed to address the problem of column/row sampling from structured data.} }
Endnote
%0 Conference Paper %T A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling %A Mostafa Rahmani %A Geroge Atia %B Proceedings of The 33rd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2016 %E Maria Florina Balcan %E Kilian Q. Weinberger %F pmlr-v48-rahmani16 %I PMLR %P 1206--1214 %U https://proceedings.mlr.press/v48/rahmani16.html %V 48 %X This paper presents a new randomized approach to high-dimensional low rank (LR) plus sparse matrix decomposition. For a data matrix D ∈R^N_1 \times N_2, the complexity of conventional decomposition methods is O(N_1 N_2 r), which limits their usefulness in big data settings (r is the rank of the LR component). In addition, the existing randomized approaches rely for the most part on uniform random sampling, which may be inefficient for many real world data matrices. The proposed subspace learning based approach recovers the LR component using only a small subset of the columns/rows of data and reduces complexity to O(\max(N_1,N_2) r^2). Even when the columns/rows are sampled uniformly at random, the sufficient number of sampled columns/rows is shown to be roughly O(r μ), where μis the coherency parameter of the LR component. In addition, efficient sampling algorithms are proposed to address the problem of column/row sampling from structured data.
RIS
TY - CPAPER TI - A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling AU - Mostafa Rahmani AU - Geroge Atia BT - Proceedings of The 33rd International Conference on Machine Learning DA - 2016/06/11 ED - Maria Florina Balcan ED - Kilian Q. Weinberger ID - pmlr-v48-rahmani16 PB - PMLR DP - Proceedings of Machine Learning Research VL - 48 SP - 1206 EP - 1214 L1 - http://proceedings.mlr.press/v48/rahmani16.pdf UR - https://proceedings.mlr.press/v48/rahmani16.html AB - This paper presents a new randomized approach to high-dimensional low rank (LR) plus sparse matrix decomposition. For a data matrix D ∈R^N_1 \times N_2, the complexity of conventional decomposition methods is O(N_1 N_2 r), which limits their usefulness in big data settings (r is the rank of the LR component). In addition, the existing randomized approaches rely for the most part on uniform random sampling, which may be inefficient for many real world data matrices. The proposed subspace learning based approach recovers the LR component using only a small subset of the columns/rows of data and reduces complexity to O(\max(N_1,N_2) r^2). Even when the columns/rows are sampled uniformly at random, the sufficient number of sampled columns/rows is shown to be roughly O(r μ), where μis the coherency parameter of the LR component. In addition, efficient sampling algorithms are proposed to address the problem of column/row sampling from structured data. ER -
APA
Rahmani, M. & Atia, G.. (2016). A Subspace Learning Approach for High Dimensional Matrix Decomposition with Efficient Column/Row Sampling. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1206-1214 Available from https://proceedings.mlr.press/v48/rahmani16.html.

Related Material