Doubly Aggressive Selective Sampling Algorithms for Classification

Koby Crammer
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:140-148, 2014.

Abstract

Online selective sampling algorithms learn to perform binary classification and, in addition, decide whether to ask, or query, for the label of any given example. We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework. Although stochastic, for some inputs our algorithms query with probability 1 and make an update even when there is no mistake but the margin is small; hence they are doubly aggressive. We prove worst-case bounds that in some settings are lower than previous bounds. Experiments with 33 document classification datasets, some with hundreds of thousands of examples, show the superiority of the doubly-aggressive algorithms in both performance and number of queries.
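To make the "doubly aggressive" idea concrete, below is a minimal sketch of a margin-based selective-sampling loop in the spirit the abstract describes. It is not the paper's algorithm: the specific query probability b/(b + max(|margin| - tau, 0)) and the perceptron-style update are assumptions, chosen only so that small-margin examples are queried with probability 1 and updated even when classified correctly.

```python
import numpy as np

def doubly_aggressive_selective_sampler(stream, dim, b=1.0, margin_thresh=1.0, seed=0):
    """Illustrative selective-sampling loop (a sketch, not the paper's exact method).

    Assumptions (not from the paper): the query probability
    b / (b + max(|margin| - margin_thresh, 0)), which equals 1 for small
    margins, and a perceptron-style update that also fires on correctly
    classified examples whose margin is below margin_thresh.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    mistakes, queries = 0, 0
    for x, get_label in stream:            # get_label() reveals y only if queried
        margin = w @ x
        # Query with probability 1 when |margin| <= margin_thresh
        # ("aggressive" querying), otherwise with probability decreasing
        # in the margin.
        p = min(1.0, b / (b + max(abs(margin) - margin_thresh, 0.0)))
        if rng.random() < p:
            queries += 1
            y = get_label()                # y in {-1, +1}
            if y * margin <= 0:
                mistakes += 1
            # Update on mistakes AND on small-margin correct predictions
            # ("aggressive" updating).
            if y * margin <= margin_thresh:
                w = w + y * x
    return w, mistakes, queries
```

With margin_thresh = 0 the query rule reduces to the familiar b/(b + |margin|) coin of earlier selective-sampling work and the update fires only on mistakes; the two aggressive components are exactly what a nonzero threshold adds.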

Cite this Paper


BibTeX
@InProceedings{pmlr-v33-crammer14,
  title     = {{Doubly Aggressive Selective Sampling Algorithms for Classification}},
  author    = {Crammer, Koby},
  booktitle = {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages     = {140--148},
  year      = {2014},
  editor    = {Kaski, Samuel and Corander, Jukka},
  volume    = {33},
  series    = {Proceedings of Machine Learning Research},
  address   = {Reykjavik, Iceland},
  month     = {22--25 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v33/crammer14.pdf},
  url       = {https://proceedings.mlr.press/v33/crammer14.html},
  abstract  = {Online selective sampling algorithms learn to perform binary classification and, in addition, decide whether to ask, or query, for the label of any given example. We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework. Although stochastic, for some inputs our algorithms query with probability 1 and make an update even when there is no mistake but the margin is small; hence they are doubly aggressive. We prove worst-case bounds that in some settings are lower than previous bounds. Experiments with 33 document classification datasets, some with hundreds of thousands of examples, show the superiority of the doubly-aggressive algorithms in both performance and number of queries.}
}
Endnote
%0 Conference Paper
%T Doubly Aggressive Selective Sampling Algorithms for Classification
%A Koby Crammer
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-crammer14
%I PMLR
%P 140--148
%U https://proceedings.mlr.press/v33/crammer14.html
%V 33
%X Online selective sampling algorithms learn to perform binary classification and, in addition, decide whether to ask, or query, for the label of any given example. We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework. Although stochastic, for some inputs our algorithms query with probability 1 and make an update even when there is no mistake but the margin is small; hence they are doubly aggressive. We prove worst-case bounds that in some settings are lower than previous bounds. Experiments with 33 document classification datasets, some with hundreds of thousands of examples, show the superiority of the doubly-aggressive algorithms in both performance and number of queries.
RIS
TY  - CPAPER
TI  - Doubly Aggressive Selective Sampling Algorithms for Classification
AU  - Koby Crammer
BT  - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA  - 2014/04/02
ED  - Samuel Kaski
ED  - Jukka Corander
ID  - pmlr-v33-crammer14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 33
SP  - 140
EP  - 148
L1  - http://proceedings.mlr.press/v33/crammer14.pdf
UR  - https://proceedings.mlr.press/v33/crammer14.html
AB  - Online selective sampling algorithms learn to perform binary classification and, in addition, decide whether to ask, or query, for the label of any given example. We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework. Although stochastic, for some inputs our algorithms query with probability 1 and make an update even when there is no mistake but the margin is small; hence they are doubly aggressive. We prove worst-case bounds that in some settings are lower than previous bounds. Experiments with 33 document classification datasets, some with hundreds of thousands of examples, show the superiority of the doubly-aggressive algorithms in both performance and number of queries.
ER  -
APA
Crammer, K. (2014). Doubly Aggressive Selective Sampling Algorithms for Classification. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:140-148. Available from https://proceedings.mlr.press/v33/crammer14.html.
