Home Page

Papers

Submissions

News

Scope

Editorial Board

Announcements

Proceedings

Open Source Software

Search

Login



RSS Feed

Clusterability: A Theoretical Study

Margareta Ackerman, Shai Ben-David; JMLR W&CP 5:1-8, 2009.

Abstract

We investigate measures of the clusterability of data sets. Namely, ways to define how `strong' or `conclusive' is the clustering structure of a given data set. We address this issue with generality, aiming for conclusions that apply regardless of any particular clustering algorithm or any specific data generation model. We survey several notions of clusterability that have been discussed in the literature, as well as propose a new notion of data clusterability. Our comparison of these notions reveals that, although they all attempt to evaluate the same intuitive property, they are pairwise inconsistent. Our analysis discovers an interesting phenomenon; Although most of the common clustering tasks are NP-hard, finding a close-to-optimal clustering for well-clusterable data sets is easy (computationally). We prove instances of this general claim with respect to the various clusterability notions that we discuss. Finally, we investigate how hard it is to determine the clusterability value of a given data set. In most cases, it turns out that this is an NP-hard problem.



Home Page

Papers

Submissions

News

Scope

Editorial Board

Announcements

Proceedings

Open Source Software

Search

Login



RSS Feed

Page last modified on Fri Apr 3 20:30:46 BST 2009.

Copyright @ JMLR 2000. All rights reserved.