Online PCA with Spectral Bounds

Zohar Karnin, Edo Liberty
Proceedings of The 28th Conference on Learning Theory, PMLR 40:1129-1140, 2015.

Abstract

This paper revisits the online PCA problem. Given a stream of $n$ vectors $x_t \in \mathbb{R}^d$ (columns of $X$), the algorithm must output $y_t \in \mathbb{R}^\ell$ (columns of $Y$) before receiving $x_{t+1}$. The goal of online PCA is to simultaneously minimize the target dimension $\ell$ and the error $\|X - (XY^+)Y\|^2$, where $Y^+$ denotes the pseudoinverse of $Y$ and $\|\cdot\|$ the spectral norm. We describe two simple and deterministic algorithms. The first receives a parameter $\Delta$ and guarantees that $\|X - (XY^+)Y\|^2$ is not significantly larger than $\Delta$. It requires a target dimension of $\ell = O(k/\varepsilon)$ for any $k, \varepsilon$ such that $\Delta \ge \varepsilon \sigma_1^2 + \sigma_{k+1}^2$, where $\sigma_i$ is the $i$th singular value of $X$. The second receives $k$ and $\varepsilon$ and guarantees that $\|X - (XY^+)Y\|^2 \le \varepsilon \sigma_1^2 + \sigma_{k+1}^2$. It requires a target dimension of $O(k \log n / \varepsilon^2)$. Different models and algorithms for online PCA have been considered in the past; this is the first to achieve a bound on the spectral norm of the residual matrix.
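To make the interface concrete, below is a minimal NumPy sketch of the online setting: the algorithm must commit to $y_t$ before seeing $x_{t+1}$, and the whole output $Y$ is judged afterwards by $\|X - (XY^+)Y\|^2$. The grow-on-large-residual rule and the names OnlinePCA and reconstruction_error are illustrative assumptions for this sketch only; they are not the paper's two deterministic algorithms, just the input/output contract they satisfy.

import numpy as np

class OnlinePCA:
    # Toy online PCA: must output y_t before seeing x_{t+1}.
    # The grow-on-large-residual rule is an illustrative heuristic,
    # not either of the paper's algorithms.
    def __init__(self, d, threshold):
        self.U = np.zeros((d, 0))   # orthonormal basis, grown over time
        self.threshold = threshold

    def process(self, x):
        r = x - self.U @ (self.U.T @ x)          # residual against span(U)
        if np.linalg.norm(r) > self.threshold:   # grow basis on large residual
            self.U = np.column_stack([self.U, r / np.linalg.norm(r)])
        return self.U.T @ x                      # y_t in the current basis

def reconstruction_error(X, Y):
    # The abstract's error: squared spectral norm of X - (X Y^+) Y.
    residual = X - (X @ np.linalg.pinv(Y)) @ Y
    return np.linalg.norm(residual, 2) ** 2

rng = np.random.default_rng(0)
d, n = 50, 200
X = rng.standard_normal((d, 5)) @ rng.standard_normal((5, n))  # ~rank-5 stream
pca = OnlinePCA(d, threshold=1.0)
ys = [pca.process(X[:, t]) for t in range(n)]
ell = pca.U.shape[1]                             # final target dimension
Y = np.column_stack([np.pad(y, (0, ell - len(y))) for y in ys])
print(ell, reconstruction_error(X, Y))

On this exactly rank-5 stream the basis stops growing at $\ell = 5$ and the error is numerically zero; the paper's contribution is choosing when and how to grow $\ell$ so that, for arbitrary streams, $\ell$ and the spectral error obey the trade-offs stated in the abstract.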

Cite this Paper


BibTeX
@InProceedings{pmlr-v40-Karnin15,
  title     = {Online {PCA} with Spectral Bounds},
  author    = {Karnin, Zohar and Liberty, Edo},
  booktitle = {Proceedings of The 28th Conference on Learning Theory},
  pages     = {1129--1140},
  year      = {2015},
  editor    = {Grünwald, Peter and Hazan, Elad and Kale, Satyen},
  volume    = {40},
  series    = {Proceedings of Machine Learning Research},
  address   = {Paris, France},
  month     = {03--06 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v40/Karnin15.pdf},
  url       = {https://proceedings.mlr.press/v40/Karnin15.html},
  abstract  = {This paper revisits the online PCA problem. Given a stream of $n$ vectors $x_t \in \mathbb{R}^d$ (columns of $X$), the algorithm must output $y_t \in \mathbb{R}^\ell$ (columns of $Y$) before receiving $x_{t+1}$. The goal of online PCA is to simultaneously minimize the target dimension $\ell$ and the error $\|X - (XY^+)Y\|^2$. We describe two simple and deterministic algorithms. The first receives a parameter $\Delta$ and guarantees that $\|X - (XY^+)Y\|^2$ is not significantly larger than $\Delta$. It requires a target dimension of $\ell = O(k/\varepsilon)$ for any $k, \varepsilon$ such that $\Delta \ge \varepsilon \sigma_1^2 + \sigma_{k+1}^2$, where $\sigma_i$ is the $i$th singular value of $X$. The second receives $k$ and $\varepsilon$ and guarantees that $\|X - (XY^+)Y\|^2 \le \varepsilon \sigma_1^2 + \sigma_{k+1}^2$. It requires a target dimension of $O(k \log n / \varepsilon^2)$. Different models and algorithms for online PCA have been considered in the past; this is the first to achieve a bound on the spectral norm of the residual matrix.}
}
APA
Karnin, Z. & Liberty, E. (2015). Online PCA with Spectral Bounds. Proceedings of The 28th Conference on Learning Theory, in Proceedings of Machine Learning Research 40:1129-1140. Available from https://proceedings.mlr.press/v40/Karnin15.html.
