Multiarmed Bandits With Limited Expert Advice

Satyen Kale
Proceedings of The 27th Conference on Learning Theory, PMLR 35:107-122, 2014.

Abstract

We consider the problem of minimizing regret in the setting of advice-efficient multiarmed bandits with expert advice. We give an algorithm for the setting of K arms and N experts, out of which we are allowed to query and use only M experts' advice in each round, which has a regret bound of $\tilde{O}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$ after T rounds. We also prove that any algorithm for this problem must have expected regret at least $\tilde{\Omega}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$, thus showing that our upper bound is nearly tight. This solves the COLT 2013 open problem of Seldin et al. (2013).
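For readers unfamiliar with the setting, here is a minimal sketch of the interaction protocol the abstract describes: in each of T rounds the learner may query only M of the N experts, receives their advice (modeled here, as in the standard bandits-with-expert-advice setting, as probability distributions over the K arms), plays an arm, and observes only that arm's loss. The expert behavior and the uniform decision rule below are hypothetical placeholders; this is not the paper's algorithm, whose querying and aggregation strategy is what achieves the stated regret bound.

```python
import numpy as np

# Minimal sketch of the limited-advice bandit protocol (not the paper's
# algorithm): K arms, N experts, a budget of M expert queries per round.
rng = np.random.default_rng(0)
K, N, M, T = 5, 20, 3, 1000

def expert_advice(expert: int, t: int) -> np.ndarray:
    # Hypothetical stand-in: expert i always recommends arm i mod K,
    # expressed as a probability distribution over the K arms.
    advice = np.zeros(K)
    advice[expert % K] = 1.0
    return advice

total_loss = 0.0
for t in range(T):
    losses = rng.random(K)                           # losses stay hidden
    queried = rng.choice(N, size=M, replace=False)   # query M of N experts
    advice = {e: expert_advice(e, t) for e in queried}
    # Placeholder decision rule: follow one queried expert at random;
    # the paper's algorithm aggregates the limited advice more cleverly.
    arm = rng.choice(K, p=advice[rng.choice(queried)])
    total_loss += losses[arm]                        # bandit feedback only
print(f"average loss over {T} rounds: {total_loss / T:.3f}")
```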

Cite this Paper


BibTeX
@InProceedings{pmlr-v35-kale14a,
  title     = {Multiarmed Bandits With Limited Expert Advice},
  author    = {Kale, Satyen},
  booktitle = {Proceedings of The 27th Conference on Learning Theory},
  pages     = {107--122},
  year      = {2014},
  editor    = {Balcan, Maria Florina and Feldman, Vitaly and Szepesvári, Csaba},
  volume    = {35},
  series    = {Proceedings of Machine Learning Research},
  address   = {Barcelona, Spain},
  month     = {13--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v35/kale14a.pdf},
  url       = {https://proceedings.mlr.press/v35/kale14a.html},
  abstract  = {We consider the problem of minimizing regret in the setting of advice-efficient multiarmed bandits with expert advice. We give an algorithm for the setting of K arms and N experts, out of which we are allowed to query and use only M experts' advice in each round, which has a regret bound of $\tilde{O}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$ after T rounds. We also prove that any algorithm for this problem must have expected regret at least $\tilde{\Omega}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$, thus showing that our upper bound is nearly tight. This solves the COLT 2013 open problem of Seldin et al. (2013).}
}
Endnote
%0 Conference Paper
%T Multiarmed Bandits With Limited Expert Advice
%A Satyen Kale
%B Proceedings of The 27th Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2014
%E Maria Florina Balcan
%E Vitaly Feldman
%E Csaba Szepesvári
%F pmlr-v35-kale14a
%I PMLR
%P 107--122
%U https://proceedings.mlr.press/v35/kale14a.html
%V 35
%X We consider the problem of minimizing regret in the setting of advice-efficient multiarmed bandits with expert advice. We give an algorithm for the setting of K arms and N experts, out of which we are allowed to query and use only M experts' advice in each round, which has a regret bound of $\tilde{O}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$ after T rounds. We also prove that any algorithm for this problem must have expected regret at least $\tilde{\Omega}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$, thus showing that our upper bound is nearly tight. This solves the COLT 2013 open problem of Seldin et al. (2013).
RIS
TY - CPAPER
TI - Multiarmed Bandits With Limited Expert Advice
AU - Satyen Kale
BT - Proceedings of The 27th Conference on Learning Theory
DA - 2014/05/29
ED - Maria Florina Balcan
ED - Vitaly Feldman
ED - Csaba Szepesvári
ID - pmlr-v35-kale14a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 35
SP - 107
EP - 122
L1 - http://proceedings.mlr.press/v35/kale14a.pdf
UR - https://proceedings.mlr.press/v35/kale14a.html
AB - We consider the problem of minimizing regret in the setting of advice-efficient multiarmed bandits with expert advice. We give an algorithm for the setting of K arms and N experts, out of which we are allowed to query and use only M experts' advice in each round, which has a regret bound of $\tilde{O}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$ after T rounds. We also prove that any algorithm for this problem must have expected regret at least $\tilde{\Omega}\left(\sqrt{\frac{\min\{K,M\}\,N}{M}T}\right)$, thus showing that our upper bound is nearly tight. This solves the COLT 2013 open problem of Seldin et al. (2013).
ER -
APA
Kale, S. (2014). Multiarmed Bandits With Limited Expert Advice. Proceedings of The 27th Conference on Learning Theory, in Proceedings of Machine Learning Research 35:107-122. Available from https://proceedings.mlr.press/v35/kale14a.html.