Best Agglomerative Ranked Subset for Feature Selection
Roberto Ruiz, José C. Riquelme, Jesús S. Aguilar-Ruiz
JMLR W&P 4:148-162, 2008.
Abstract
The enormous increase in the size of databases makes finding an optimal subset
of features extremely difficult. In this paper, a new feature selection method
is proposed that allows any subset evaluator (including the wrapper
evaluation method) to be used to find a group of features that distinguishes
between the different possible classes. The method,
BARS (Best Agglomerative Ranked Subset), is based on the idea of relevance and
redundancy, in the sense that a ranked feature (or set) is more relevant if it
adds information when it is included in the final subset of selected features.
This heuristic method reduces dimensionality drastically and improves
accuracy compared with both the complete feature set and other feature
selection algorithms.
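
The abstract does not spell out the procedure, so the following is only a minimal sketch of one way the relevance/redundancy idea could be read, not the authors' BARS algorithm. It ranks single features with an arbitrary subset evaluator, then repeatedly merges the top-ranked subsets with lower-ranked ones and keeps a merged subset only when it scores strictly higher than the best subset so far (that is, when the merge adds information). The function name `agglomerative_ranked_subset`, the `top_k` parameter, and the toy evaluator `toy_score` are all hypothetical and introduced purely for illustration.

```python
from typing import Callable, FrozenSet, List

# Any subset evaluator fits here: a filter measure or a wrapper
# (e.g. cross-validated accuracy of a classifier on those features).
Evaluator = Callable[[FrozenSet[int]], float]


def agglomerative_ranked_subset(
    n_features: int, evaluate: Evaluator, top_k: int = 3
) -> FrozenSet[int]:
    """Hypothetical sketch: greedily merge ranked subsets while the score improves."""

    def rank(subsets) -> List[FrozenSet[int]]:
        # Best score first; ties broken by feature indices for determinism.
        return sorted(subsets, key=lambda s: (-evaluate(s), sorted(s)))

    # Start from a ranking of the individual features.
    candidates = rank(frozenset([i]) for i in range(n_features))
    best, best_score = candidates[0], evaluate(candidates[0])

    while len(candidates) > 1:
        merged = set()
        # Merge each of the top_k subsets with every lower-ranked subset.
        for i, head in enumerate(candidates[:top_k]):
            for other in candidates[i + 1:]:
                union = head | other
                # Keep the union only if it beats the best subset so far;
                # a redundant feature adds nothing and is discarded.
                if evaluate(union) > best_score:
                    merged.add(union)
        if not merged:
            break  # no merge adds information, so stop
        candidates = rank(merged)
        best, best_score = candidates[0], evaluate(candidates[0])

    return best


# Toy evaluator (an assumption for the demo): features 0 and 1 carry the same
# information, feature 2 carries different information, feature 3 is noise.
GROUP = {0: "a", 1: "a", 2: "b", 3: None}


def toy_score(subset: FrozenSet[int]) -> float:
    covered = {GROUP[f] for f in subset if GROUP[f] is not None}
    return len(covered) - 0.01 * len(subset)  # small penalty per feature


selected = agglomerative_ranked_subset(4, toy_score)
print(sorted(selected))  # prints [0, 2]
```

In this toy run the redundant feature 1 and the noise feature 3 are never kept, because merging them with an already selected subset does not raise the score. With a wrapper evaluator each call to `evaluate` trains a model, so a practical version would cache scores rather than recompute them during ranking.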