Exploiting tree-based variable importances to selectively identify relevant variables

Vân Anh Huynh-Thu, Louis Wehenkel, Pierre Geurts; JMLR W&P 4:60-73, 2008.

Abstract

This paper proposes a novel statistical procedure, based on permutation tests, for extracting a subset of truly relevant variables from multivariate importance rankings derived from tree-based supervised learning methods. It also shows that directly extending the classical permutation-test approach for estimating false discovery rates of univariate variable scoring procedures does not work well for multivariate tree-based importance measures.


