Increasing Feature Selection Accuracy for L1 Regularized Linear Models

Abhishek Jaiantilal, Gregory Grudic
Proceedings of the Fourth International Workshop on Feature Selection in Data Mining, PMLR 10:86-96, 2010.

Abstract

L1 (also referred to as the 1-norm or Lasso) penalty-based formulations have been shown to be effective in problem domains where noisy features are present. However, the L1 penalty does not have favorable asymptotic properties for feature selection and has been shown to be inconsistent as a feature selection estimator, e.g., when noisy features are correlated with the relevant features. This can hinder estimation of the correct feature set in domains such as robotics, where both the number of examples and the number of features are large. The weighted lasso penalty (Zou, 2006) has been proposed to rectify this problem. This paper proposes a novel method for identifying problem-specific L1 feature weights that builds on the results of Zou (2006) and Rocha et al. (2009) and is applicable to both regression and classification algorithms. Our method increases the accuracy of L1-penalized algorithms by running randomized experiments on subsets of the training data as a fast pre-processing step. We present experimental and theoretical results supporting the efficacy of the proposed method on two L1-penalized classification algorithms.
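To make the weighted-lasso idea concrete, below is a minimal Python sketch (using NumPy and scikit-learn) of how problem-specific L1 feature weights might be estimated from lasso fits on random subsets of the training data and then applied through a weighted penalty, min ||y - Xb||^2 + lambda * sum_j w_j |b_j|, via column rescaling. This illustrates the general weighted-lasso mechanism of Zou (2006), not the authors' exact procedure; the function names, subset count, subset fraction, and the 1/(mean |coef| + eps) weight formula are illustrative assumptions.

# A minimal sketch, not the authors' exact algorithm: estimate per-feature
# L1 weights from lasso fits on random training subsets, then solve a
# weighted lasso by rescaling columns. The subset count/size and the weight
# formula 1/(mean |coef| + eps) are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso

def subset_feature_weights(X, y, n_subsets=20, frac=0.5, alpha=0.1,
                           eps=1e-6, seed=0):
    # Average absolute lasso coefficients over random subsets; features
    # that rarely receive coefficient mass get a large penalty weight.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    scores = np.zeros(p)
    for _ in range(n_subsets):
        idx = rng.choice(n, size=max(1, int(frac * n)), replace=False)
        scores += np.abs(Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_)
    return 1.0 / (scores / n_subsets + eps)

def weighted_lasso(X, y, w, alpha=0.1):
    # min ||y - Xb||^2 / (2n) + alpha * sum_j w_j |b_j| is equivalent to an
    # ordinary lasso on the rescaled design X / w, with coefficients
    # unscaled afterwards (valid because every w_j > 0).
    model = Lasso(alpha=alpha).fit(X / w, y)
    return model.coef_ / w, model.intercept_

With w from subset_feature_weights, features that are consistently useful across subsets receive small weights and are penalized less, which is the mechanism by which weighted L1 penalties can restore feature selection consistency.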

Cite this Paper


BibTeX
@InProceedings{pmlr-v10-jaiantilal10a,
  title     = {Increasing Feature Selection Accuracy for L1 Regularized Linear Models},
  author    = {Jaiantilal, Abhishek and Grudic, Gregory},
  booktitle = {Proceedings of the Fourth International Workshop on Feature Selection in Data Mining},
  pages     = {86--96},
  year      = {2010},
  editor    = {Liu, Huan and Motoda, Hiroshi and Setiono, Rudy and Zhao, Zheng},
  volume    = {10},
  series    = {Proceedings of Machine Learning Research},
  address   = {Hyderabad, India},
  month     = {21 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v10/jaiantilal10a/jaiantilal10a.pdf},
  url       = {https://proceedings.mlr.press/v10/jaiantilal10a.html}
}
Endnote
%0 Conference Paper
%T Increasing Feature Selection Accuracy for L1 Regularized Linear Models
%A Abhishek Jaiantilal
%A Gregory Grudic
%B Proceedings of the Fourth International Workshop on Feature Selection in Data Mining
%C Proceedings of Machine Learning Research
%D 2010
%E Huan Liu
%E Hiroshi Motoda
%E Rudy Setiono
%E Zheng Zhao
%F pmlr-v10-jaiantilal10a
%I PMLR
%P 86--96
%U https://proceedings.mlr.press/v10/jaiantilal10a.html
%V 10
RIS
TY  - CPAPER
TI  - Increasing Feature Selection Accuracy for L1 Regularized Linear Models
AU  - Abhishek Jaiantilal
AU  - Gregory Grudic
BT  - Proceedings of the Fourth International Workshop on Feature Selection in Data Mining
DA  - 2010/05/26
ED  - Huan Liu
ED  - Hiroshi Motoda
ED  - Rudy Setiono
ED  - Zheng Zhao
ID  - pmlr-v10-jaiantilal10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 10
SP  - 86
EP  - 96
L1  - http://proceedings.mlr.press/v10/jaiantilal10a/jaiantilal10a.pdf
UR  - https://proceedings.mlr.press/v10/jaiantilal10a.html
ER  -
APA
Jaiantilal, A. & Grudic, G. (2010). Increasing Feature Selection Accuracy for L1 Regularized Linear Models. Proceedings of the Fourth International Workshop on Feature Selection in Data Mining, in Proceedings of Machine Learning Research 10:86-96. Available from https://proceedings.mlr.press/v10/jaiantilal10a.html.

Related Material

Download PDF: http://proceedings.mlr.press/v10/jaiantilal10a/jaiantilal10a.pdf