Active Learning from Multiple Knowledge Sources

Yan Yan; Romer Rosales; Glenn Fung; Faisal Farooq; Bharat Rao; Jennifer Dy

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1350-1357, 2012.

Abstract

Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.
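The selection idea in the abstract (pick both a sample and an annotator so that the annotator's label is maximally informative about the unknown true label) can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes binary labels, a symmetric noise model in which annotator t labels sample i correctly with a known probability `accuracy[i][t]`, and a current model estimate `p_true[i]` of P(y_i = 1). All function and variable names here are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(p_true, accuracy):
    """I(Y; Z) in bits for a binary true label Y with P(Y=1) = p_true and an
    annotator label Z that equals Y with probability `accuracy`.

    Assumption (not from the paper): the annotator's noise is symmetric,
    so H(Z | Y) = H(accuracy).
    """
    p_z1 = p_true * accuracy + (1.0 - p_true) * (1.0 - accuracy)
    return binary_entropy(p_z1) - binary_entropy(accuracy)


def select_pair(p_true, accuracy):
    """Return (sample_index, annotator_index, score) with the highest
    mutual information between annotator label and true label.

    p_true[i]      : assumed model estimate of P(y_i = 1)
    accuracy[i][t] : assumed accuracy of annotator t on sample i
    """
    best = None
    for i, p in enumerate(p_true):
        for t, a in enumerate(accuracy[i]):
            score = mutual_information(p, a)
            if best is None or score > best[2]:
                best = (i, t, score)
    return best


if __name__ == "__main__":
    # Two samples, two annotators; accuracy varies per sample and annotator.
    p_true = [0.5, 0.95]
    accuracy = [[0.6, 0.9], [0.99, 0.7]]
    i, t, score = select_pair(p_true, accuracy)
    print(f"query sample {i} from annotator {t} (score {score:.3f} bits)")
```

Under this toy model an annotator who is wrong 90% of the time scores the same as one who is right 90% of the time, since a consistently flipped label carries the same information. This is one way to see why an information-based criterion can remain useful when some annotators are adversarial.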

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-yan12,
  title = 	 {Active Learning from Multiple Knowledge Sources},
  author = 	 {Yan, Yan and Rosales, Romer and Fung, Glenn and Farooq, Faisal and Rao, Bharat and Dy, Jennifer},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1350--1357},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/yan12/yan12.pdf},
  url = 	 {https://proceedings.mlr.press/v22/yan12.html},
  abstract = 	 {Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.}
}

Endnote

%0 Conference Paper
%T Active Learning from Multiple Knowledge Sources
%A Yan Yan
%A Romer Rosales
%A Glenn Fung
%A Faisal Farooq
%A Bharat Rao
%A Jennifer Dy
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-yan12
%I PMLR
%P 1350--1357
%U https://proceedings.mlr.press/v22/yan12.html
%V 22
%X Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.

RIS


TY  - CPAPER
TI  - Active Learning from Multiple Knowledge Sources
AU  - Yan Yan
AU  - Romer Rosales
AU  - Glenn Fung
AU  - Faisal Farooq
AU  - Bharat Rao
AU  - Jennifer Dy
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/04/21
ED  - Neil D. Lawrence
ED  - Mark Girolami
ID  - pmlr-v22-yan12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 1350
EP  - 1357
L1  - http://proceedings.mlr.press/v22/yan12/yan12.pdf
UR  - https://proceedings.mlr.press/v22/yan12.html
AB  - Some supervised learning tasks do not fit the usual single annotator scenario. In these problems, ground-truth may not exist and multiple annotators are generally available. A few approaches have been proposed to address this learning problem. In this setting active learning (AL), the problem of optimally selecting unlabeled samples for labeling, offers new challenges and has received little attention. In multiple annotator AL, it is not sufficient to select a sample for labeling since, in addition, an optimal annotator must also be selected. This setting is of great interest as annotators’ expertise generally varies and could depend on the given sample itself; additionally, some annotators may be adversarial. Thus, clearly the information provided by some annotators should be more valuable than that provided by others and it could vary across data points. We propose an AL approach for this new scenario motivated by information theoretic principles. Specifically, we focus on maximizing the information that an annotator label provides about the true (but unknown) label of the data point. We develop this concept, propose an algorithm for active learning, and experimentally validate the proposed approach.
ER  -

APA


Yan, Y., Rosales, R., Fung, G., Farooq, F., Rao, B. & Dy, J. (2012). Active Learning from Multiple Knowledge Sources. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1350-1357. Available from https://proceedings.mlr.press/v22/yan12.html.
