No-Regret Algorithms for Heavy-Tailed Linear Bandits

Andres Munoz Medina, Scott Yang
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1642-1650, 2016.

Abstract

We analyze the problem of linear bandits under heavy-tailed noise. Most of the work on linear bandits has been based on the assumption of bounded or sub-Gaussian noise. However, this assumption is often violated in common scenarios such as financial markets. We present two algorithms to tackle this problem: one based on dynamic truncation and one based on a median-of-means estimator. We show that, when the noise admits only a 1+ε moment, these algorithms are still able to achieve regret in \widetilde{O}(T^{\frac{2+ε}{2(1+ε)}}) and \widetilde{O}(T^{\frac{1+2ε}{1+3ε}}) respectively. In particular, they guarantee sublinear regret as long as the noise has finite variance. We also present empirical results showing that our algorithms achieve better performance than the current state of the art for bounded noise when the L_∞ bound on the noise is large yet the 1+ε moment of the noise is small.
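The abstract names two robust-mean primitives that the paper's bandit algorithms build on: truncation (clip observations before averaging) and median of means (average within blocks, then take the median across blocks). As a rough illustration only, here is a minimal sketch of these two estimators; the function names, parameters, and thresholds are illustrative and are not the paper's actual algorithms, which apply these ideas inside a linear-bandit loop.

```python
import statistics

def truncated_mean(samples, threshold):
    # Truncation: clip each observation to [-threshold, threshold] before
    # averaging, which limits the influence of heavy-tailed outliers.
    clipped = [max(-threshold, min(threshold, x)) for x in samples]
    return sum(clipped) / len(clipped)

def median_of_means(samples, num_blocks):
    # Median of means: split the samples into blocks, average each block,
    # and return the median of the block means. A few extreme observations
    # can corrupt only a minority of blocks, so the median stays stable.
    block_size = len(samples) // num_blocks
    means = [
        sum(samples[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(num_blocks)
    ]
    return statistics.median(means)
```

For example, with eight observations near 1.0 and one outlier at 1000.0, the plain mean is badly skewed, while `median_of_means(samples, 3)` still returns 1.0 because the outlier corrupts only one of the three blocks.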

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-medina16,
  title = {No-Regret Algorithms for Heavy-Tailed Linear Bandits},
  author = {Medina, Andres Munoz and Yang, Scott},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages = {1642--1650},
  year = {2016},
  editor = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = {48},
  series = {Proceedings of Machine Learning Research},
  address = {New York, New York, USA},
  month = {20--22 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v48/medina16.pdf},
  url = {https://proceedings.mlr.press/v48/medina16.html},
  abstract = {We analyze the problem of linear bandits under heavy-tailed noise. Most of the work on linear bandits has been based on the assumption of bounded or sub-Gaussian noise. However, this assumption is often violated in common scenarios such as financial markets. We present two algorithms to tackle this problem: one based on dynamic truncation and one based on a median-of-means estimator. We show that, when the noise admits only a 1+ε moment, these algorithms are still able to achieve regret in \widetilde{O}(T^{\frac{2+ε}{2(1+ε)}}) and \widetilde{O}(T^{\frac{1+2ε}{1+3ε}}) respectively. In particular, they guarantee sublinear regret as long as the noise has finite variance. We also present empirical results showing that our algorithms achieve better performance than the current state of the art for bounded noise when the L_∞ bound on the noise is large yet the 1+ε moment of the noise is small.}
}
Endnote
%0 Conference Paper
%T No-Regret Algorithms for Heavy-Tailed Linear Bandits
%A Andres Munoz Medina
%A Scott Yang
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-medina16
%I PMLR
%P 1642--1650
%U https://proceedings.mlr.press/v48/medina16.html
%V 48
%X We analyze the problem of linear bandits under heavy-tailed noise. Most of the work on linear bandits has been based on the assumption of bounded or sub-Gaussian noise. However, this assumption is often violated in common scenarios such as financial markets. We present two algorithms to tackle this problem: one based on dynamic truncation and one based on a median-of-means estimator. We show that, when the noise admits only a 1+ε moment, these algorithms are still able to achieve regret in \widetilde{O}(T^{\frac{2+ε}{2(1+ε)}}) and \widetilde{O}(T^{\frac{1+2ε}{1+3ε}}) respectively. In particular, they guarantee sublinear regret as long as the noise has finite variance. We also present empirical results showing that our algorithms achieve better performance than the current state of the art for bounded noise when the L_∞ bound on the noise is large yet the 1+ε moment of the noise is small.
RIS
TY - CPAPER
TI - No-Regret Algorithms for Heavy-Tailed Linear Bandits
AU - Andres Munoz Medina
AU - Scott Yang
BT - Proceedings of The 33rd International Conference on Machine Learning
DA - 2016/06/11
ED - Maria Florina Balcan
ED - Kilian Q. Weinberger
ID - pmlr-v48-medina16
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 48
SP - 1642
EP - 1650
L1 - http://proceedings.mlr.press/v48/medina16.pdf
UR - https://proceedings.mlr.press/v48/medina16.html
AB - We analyze the problem of linear bandits under heavy-tailed noise. Most of the work on linear bandits has been based on the assumption of bounded or sub-Gaussian noise. However, this assumption is often violated in common scenarios such as financial markets. We present two algorithms to tackle this problem: one based on dynamic truncation and one based on a median-of-means estimator. We show that, when the noise admits only a 1+ε moment, these algorithms are still able to achieve regret in \widetilde{O}(T^{\frac{2+ε}{2(1+ε)}}) and \widetilde{O}(T^{\frac{1+2ε}{1+3ε}}) respectively. In particular, they guarantee sublinear regret as long as the noise has finite variance. We also present empirical results showing that our algorithms achieve better performance than the current state of the art for bounded noise when the L_∞ bound on the noise is large yet the 1+ε moment of the noise is small.
ER -
APA
Medina, A.M. & Yang, S. (2016). No-Regret Algorithms for Heavy-Tailed Linear Bandits. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1642-1650. Available from https://proceedings.mlr.press/v48/medina16.html.