Active Learning and Experimental Design with SVMs

Chia-Hua Ho, Ming-Hen Tsai, Chih-Jen Lin
Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, PMLR 16:71-84, 2011.

Abstract

In this paper, we consider active learning as a procedure of iteratively performing two steps: first, we train a classifier based on labeled and unlabeled data. Second, we query labels of some data points. The first part is achieved mainly by standard classifiers such as SVM and logistic regression. We develop additional techniques when there are very few labeled data. These techniques help to obtain good classifiers in the early stage of the active learning procedure. In the second part, based on SVM or logistic regression decision values, we propose a framework to flexibly select points for query. We find that selecting points with various distances to the decision boundary is important, but including more points close to the decision boundary further improves the performance. Our experiments are conducted on the data sets of Causality Active Learning Challenge. With measurements of Area Under Curve (AUC) and Area under the Learning Curve (ALC), we find suitable methods for different data sets.
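
The query step described in the abstract (choose points at various distances from the decision boundary, with extra weight on points close to it) can be illustrated with a minimal sketch. This is not the authors' framework: the function name `select_queries`, the `near_fraction` parameter, and the even-spacing rule for the remaining picks are assumptions made for illustration. It takes a list of decision values for the unlabeled points, as an SVM or logistic regression model would produce, and returns the indices to query.

```python
def select_queries(decision_values, batch_size, near_fraction=0.5):
    """Pick indices of unlabeled points to query (illustrative sketch only).

    decision_values: signed classifier outputs; |value| is used as a proxy
        for distance to the decision boundary.
    near_fraction: hypothetical knob, the share of the batch taken from the
        points closest to the boundary.
    """
    if batch_size <= 0 or not decision_values:
        return []

    # Rank points from closest to farthest from the boundary.
    order = sorted(range(len(decision_values)),
                   key=lambda i: abs(decision_values[i]))
    batch_size = min(batch_size, len(order))

    # Part of the batch: the points nearest the boundary.
    n_near = min(batch_size, max(1, int(batch_size * near_fraction)))
    chosen = order[:n_near]

    # Rest of the batch: spread evenly over the remaining ranked points,
    # so that a range of distances is represented.
    rest = order[n_near:]
    n_spread = batch_size - n_near
    if n_spread > 0:
        step = len(rest) / n_spread
        chosen += [rest[int(k * step + step / 2)] for k in range(n_spread)]
    return chosen


values = [-2.0, -0.1, 0.05, 1.5, 0.3, -0.9, 3.0, -0.2]
print(select_queries(values, batch_size=4))  # [2, 1, 4, 0]
```

Setting `near_fraction=1.0` reduces the sketch to querying only the points nearest the boundary, while smaller values give more of the batch to points farther away.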

Cite this Paper


BibTeX
@InProceedings{pmlr-v16-ho11a,
  title = {Active Learning and Experimental Design with SVMs},
  author = {Ho, Chia-Hua and Tsai, Ming-Hen and Lin, Chih-Jen},
  booktitle = {Active Learning and Experimental Design workshop In conjunction with AISTATS 2010},
  pages = {71--84},
  year = {2011},
  editor = {Guyon, Isabelle and Cawley, Gavin and Dror, Gideon and Lemaire, Vincent and Statnikov, Alexander},
  volume = {16},
  series = {Proceedings of Machine Learning Research},
  address = {Sardinia, Italy},
  month = {16 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v16/ho11a/ho11a.pdf},
  url = {https://proceedings.mlr.press/v16/ho11a.html},
  abstract = {In this paper, we consider active learning as a procedure of iteratively performing two steps: first, we train a classifier based on labeled and unlabeled data. Second, we query labels of some data points. The first part is achieved mainly by standard classifiers such as SVM and logistic regression. We develop additional techniques when there are very few labeled data. These techniques help to obtain good classifiers in the early stage of the active learning procedure. In the second part, based on SVM or logistic regression decision values, we propose a framework to flexibly select points for query. We find that selecting points with various distances to the decision boundary is important, but including more points close to the decision boundary further improves the performance. Our experiments are conducted on the data sets of Causality Active Learning Challenge. With measurements of Area Under Curve (AUC) and Area under the Learning Curve (ALC), we find suitable methods for different data sets.}
}
Endnote
%0 Conference Paper
%T Active Learning and Experimental Design with SVMs
%A Chia-Hua Ho
%A Ming-Hen Tsai
%A Chih-Jen Lin
%B Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
%C Proceedings of Machine Learning Research
%D 2011
%E Isabelle Guyon
%E Gavin Cawley
%E Gideon Dror
%E Vincent Lemaire
%E Alexander Statnikov
%F pmlr-v16-ho11a
%I PMLR
%P 71--84
%U https://proceedings.mlr.press/v16/ho11a.html
%V 16
%X In this paper, we consider active learning as a procedure of iteratively performing two steps: first, we train a classifier based on labeled and unlabeled data. Second, we query labels of some data points. The first part is achieved mainly by standard classifiers such as SVM and logistic regression. We develop additional techniques when there are very few labeled data. These techniques help to obtain good classifiers in the early stage of the active learning procedure. In the second part, based on SVM or logistic regression decision values, we propose a framework to flexibly select points for query. We find that selecting points with various distances to the decision boundary is important, but including more points close to the decision boundary further improves the performance. Our experiments are conducted on the data sets of Causality Active Learning Challenge. With measurements of Area Under Curve (AUC) and Area under the Learning Curve (ALC), we find suitable methods for different data sets.
RIS
TY  - CPAPER
TI  - Active Learning and Experimental Design with SVMs
AU  - Chia-Hua Ho
AU  - Ming-Hen Tsai
AU  - Chih-Jen Lin
BT  - Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
DA  - 2011/04/21
ED  - Isabelle Guyon
ED  - Gavin Cawley
ED  - Gideon Dror
ED  - Vincent Lemaire
ED  - Alexander Statnikov
ID  - pmlr-v16-ho11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 16
SP  - 71
EP  - 84
L1  - http://proceedings.mlr.press/v16/ho11a/ho11a.pdf
UR  - https://proceedings.mlr.press/v16/ho11a.html
AB  - In this paper, we consider active learning as a procedure of iteratively performing two steps: first, we train a classifier based on labeled and unlabeled data. Second, we query labels of some data points. The first part is achieved mainly by standard classifiers such as SVM and logistic regression. We develop additional techniques when there are very few labeled data. These techniques help to obtain good classifiers in the early stage of the active learning procedure. In the second part, based on SVM or logistic regression decision values, we propose a framework to flexibly select points for query. We find that selecting points with various distances to the decision boundary is important, but including more points close to the decision boundary further improves the performance. Our experiments are conducted on the data sets of Causality Active Learning Challenge. With measurements of Area Under Curve (AUC) and Area under the Learning Curve (ALC), we find suitable methods for different data sets.
ER  -
APA
Ho, C., Tsai, M. & Lin, C. (2011). Active Learning and Experimental Design with SVMs. Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, in Proceedings of Machine Learning Research 16:71-84. Available from https://proceedings.mlr.press/v16/ho11a.html.
