Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture

Lijie Chen, Jian Li
29th Annual Conference on Learning Theory, PMLR 49:1643-1646, 2016.

Abstract

The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.

Cite this Paper


BibTeX
@InProceedings{pmlr-v49-chen16b, title = {Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture}, author = {Chen, Lijie and Li, Jian}, booktitle = {29th Annual Conference on Learning Theory}, pages = {1643--1646}, year = {2016}, editor = {Feldman, Vitaly and Rakhlin, Alexander and Shamir, Ohad}, volume = {49}, series = {Proceedings of Machine Learning Research}, address = {Columbia University, New York, New York, USA}, month = {23--26 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v49/chen16b.pdf}, url = {https://proceedings.mlr.press/v49/chen16b.html}, abstract = {The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.} }
Endnote
%0 Conference Paper %T Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture %A Lijie Chen %A Jian Li %B 29th Annual Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2016 %E Vitaly Feldman %E Alexander Rakhlin %E Ohad Shamir %F pmlr-v49-chen16b %I PMLR %P 1643--1646 %U https://proceedings.mlr.press/v49/chen16b.html %V 49 %X The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem.
RIS
TY - CPAPER TI - Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture AU - Lijie Chen AU - Jian Li BT - 29th Annual Conference on Learning Theory DA - 2016/06/06 ED - Vitaly Feldman ED - Alexander Rakhlin ED - Ohad Shamir ID - pmlr-v49-chen16b PB - PMLR DP - Proceedings of Machine Learning Research VL - 49 SP - 1643 EP - 1646 L1 - http://proceedings.mlr.press/v49/chen16b.pdf UR - https://proceedings.mlr.press/v49/chen16b.html AB - The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of O(\sum_i=2^n \Delta_i^-2(\lnδ^-1 + \ln\ln\Delta_i^-1)) (\Delta_i is the difference between the largest mean and the i^th mean), while the best known lower bound is Ω(\sum_i=2^n \Delta_i^-2\lnδ^-1) for general instances and Ω(∆^-2 \ln\ln ∆^-1) for the two-arm instances. We propose to study the instance-wise optimality for the BEST-1-ARM problem. Previous work has proved that it is impossible to have an instance optimal algorithm for the 2-arm problem. However, we conjecture that modulo the additive term Ω(\Delta_2^-2 \ln\ln \Delta_2^-1) (which is an upper bound and worst case lower bound for the 2-arm problem), there is an instance optimal algorithm for BEST-1-ARM. Moreover, we introduce a new quantity, called the gap entropy for a best-arm problem instance, and conjecture that it is the instance-wise lower bound. Hence, resolving this conjecture would provide a final answer to the old and basic problem. ER -
APA
Chen, L. & Li, J.. (2016). Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture. 29th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 49:1643-1646 Available from https://proceedings.mlr.press/v49/chen16b.html.

Related Material