## Similarity-based Clustering by Left-Stochastic Matrix Factorization

*Raman Arora, Maya R. Gupta, Amol Kapila, Maryam Fazel*; 14(Jul):1715–1746, 2013.

### Abstract

For similarity-based clustering, we propose modeling the entries
of a given similarity matrix as the inner products of the
unknown cluster probabilities. To estimate the cluster
probabilities from the given similarity matrix, we introduce a
left-stochastic non-negative matrix factorization problem. A
rotation-based algorithm is proposed for the matrix
factorization. Conditions for unique matrix factorizations and
clusterings are given, and an error bound is provided. The
algorithm is particularly efficient for the case of two
clusters, which motivates a hierarchical variant for cases where
the number of desired clusters is large. Experiments show that
the proposed left-stochastic decomposition clustering model
produces relatively high within-cluster similarity on most data
sets and can match given class labels, and that the efficient
hierarchical variant performs surprisingly well.
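
To make the modeling assumption concrete, the following minimal sketch shows only the forward direction of the model: given a left-stochastic matrix `P` (non-negative, each column summing to one, with column `j` holding the cluster probabilities of sample `j`), the similarity between two samples is taken to be the inner product of their probability vectors. This is not the rotation-based factorization algorithm from the paper; the function names and toy values are hypothetical, and any scaling constant the full paper may apply to the similarity matrix is omitted.

```python
# Illustrative sketch of the forward model K = P^T P only.
# This is NOT the paper's rotation-based factorization algorithm;
# all names and the toy data below are hypothetical.

def is_left_stochastic(P, tol=1e-9):
    """Check that P (k x n, list of rows) is non-negative with columns summing to 1."""
    k, n = len(P), len(P[0])
    nonneg = all(P[c][j] >= 0 for c in range(k) for j in range(n))
    sums_ok = all(abs(sum(P[c][j] for c in range(k)) - 1.0) <= tol
                  for j in range(n))
    return nonneg and sums_ok

def similarity_from_probabilities(P):
    """Return the n x n matrix whose (i, j) entry is the inner product of columns i and j of P."""
    k, n = len(P), len(P[0])
    return [[sum(P[c][i] * P[c][j] for c in range(k)) for j in range(n)]
            for i in range(n)]

def hard_assignments(P):
    """Assign each sample to its most probable cluster (argmax over each column)."""
    k, n = len(P), len(P[0])
    return [max(range(k), key=lambda c: P[c][j]) for j in range(n)]

# Hypothetical toy example: 2 clusters, 4 samples.
P = [[0.9, 0.8, 0.2, 0.1],
     [0.1, 0.2, 0.8, 0.9]]

K = similarity_from_probabilities(P)
labels = hard_assignments(P)
```

In this toy example, samples that share a dominant cluster have a larger modeled similarity than samples that do not. The clustering problem described in the abstract runs in the opposite direction: it starts from a given similarity matrix and estimates `P`.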
