Large-Scale Bandit Problems and KWIK Learning

Jacob Abernethy, Kareem Amin, Michael Kearns, Moez Draief
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(1):588-596, 2013.

Abstract

We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-abernethy13, title = {Large-Scale Bandit Problems and {KWIK} Learning}, author = {Abernethy, Jacob and Amin, Kareem and Kearns, Michael and Draief, Moez}, booktitle = {Proceedings of the 30th International Conference on Machine Learning}, pages = {588--596}, year = {2013}, editor = {Dasgupta, Sanjoy and McAllester, David}, volume = {28}, number = {1}, series = {Proceedings of Machine Learning Research}, address = {Atlanta, Georgia, USA}, month = {17--19 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v28/abernethy13.pdf}, url = {https://proceedings.mlr.press/v28/abernethy13.html}, abstract = {We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.} }
Endnote
%0 Conference Paper %T Large-Scale Bandit Problems and KWIK Learning %A Jacob Abernethy %A Kareem Amin %A Michael Kearns %A Moez Draief %B Proceedings of the 30th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2013 %E Sanjoy Dasgupta %E David McAllester %F pmlr-v28-abernethy13 %I PMLR %P 588--596 %U https://proceedings.mlr.press/v28/abernethy13.html %V 28 %N 1 %X We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time.
RIS
TY - CPAPER TI - Large-Scale Bandit Problems and KWIK Learning AU - Jacob Abernethy AU - Kareem Amin AU - Michael Kearns AU - Moez Draief BT - Proceedings of the 30th International Conference on Machine Learning DA - 2013/02/13 ED - Sanjoy Dasgupta ED - David McAllester ID - pmlr-v28-abernethy13 PB - PMLR DP - Proceedings of Machine Learning Research VL - 28 IS - 1 SP - 588 EP - 596 L1 - http://proceedings.mlr.press/v28/abernethy13.pdf UR - https://proceedings.mlr.press/v28/abernethy13.html AB - We show that parametric multi-armed bandit (MAB) problems with large state and action spaces can be algorithmically reduced to the supervised learning model known as Knows What It Knows or KWIK learning. We give matching impossibility results showing that the KWIK learnability requirement cannot be replaced by weaker supervised learning assumptions. We provide such results in both the standard parametric MAB setting, as well as for a new model in which the action space is finite but growing with time. ER -
APA
Abernethy, J., Amin, K., Kearns, M. & Draief, M.. (2013). Large-Scale Bandit Problems and KWIK Learning. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(1):588-596 Available from https://proceedings.mlr.press/v28/abernethy13.html.

Related Material