Unsupervised Variable Selection: when random rankings sound as irrelevancy
Sébastien Guérif
JMLR W&P 4:163-177, 2008.
Abstract
Whereas variable selection has been extensively studied in the context of
supervised learning, unsupervised variable selection has attracted the attention
of researchers only more recently, as the amount of available unlabeled data has exploded.
Many unsupervised variable ranking criteria have been proposed, and their relevance is usually
demonstrated using either external cluster validity indexes or the accuracy of a classifier,
both of which are supervised criteria. Yet the major issue of selecting a variable subset
according to a ranking measure has been addressed by only a few authors in the unsupervised
learning context. In this paper, we propose combining multiple rankings to move toward
a stable consensus variable subset in a totally unsupervised fashion.