Anomaly Ranking as Supervised Bipartite Ranking

Stephan Clémençon, Sylvain Robbiano
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):343-351, 2014.

Abstract

The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with an MV curve as low as possible at any point. In this paper, it is proved that, in the case where the data generating probability distribution has compact support, anomaly ranking is equivalent to (supervised) bipartite ranking, where the goal is to discriminate between the underlying probability distribution and the uniform distribution with the same support. In this situation, the MV curve can then be seen as a simple transform of the corresponding ROC curve. Exploiting this view, we then show how to use bipartite ranking algorithms, possibly combined with random sampling, to solve the MV curve minimization problem. Numerical experiments based on a variety of bipartite ranking algorithms well-documented in the literature are displayed in order to illustrate the relevance of our approach.
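
The reduction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes one-dimensional data supported on [0, 1], treats the observed data as the positive class and a sample drawn uniformly on the support as the negative class, and swaps in a naive histogram ratio as a stand-in for a real bipartite ranking algorithm. The function names `fit_histogram_scorer` and `empirical_mv` are hypothetical. Under these assumptions, the empirical MV value at mass level alpha is the support volume times the fraction of uniform points scored above the threshold that captures a fraction alpha of the data (a false positive rate read off at true positive rate alpha).

```python
import random


def fit_histogram_scorer(data, uniform, n_bins=20):
    """Naive bipartite ranker on [0, 1] (illustrative stand-in only).

    Scores a point by the smoothed ratio of data counts to uniform counts
    in its bin, a crude plug-in estimate of the likelihood ratio.
    """
    pos = [0] * n_bins
    neg = [0] * n_bins
    for x in data:
        pos[min(int(x * n_bins), n_bins - 1)] += 1
    for u in uniform:
        neg[min(int(u * n_bins), n_bins - 1)] += 1
    ratio = [(p + 1) / (q + 1) for p, q in zip(pos, neg)]
    return lambda x: ratio[min(int(x * n_bins), n_bins - 1)]


def empirical_mv(score, data, uniform, alpha, support_volume=1.0):
    """Estimate the volume of the score level set holding about alpha of the data."""
    s_data = sorted((score(x) for x in data), reverse=True)
    k = max(1, min(len(s_data), round(alpha * len(s_data))))
    threshold = s_data[k - 1]  # k-th largest score among the data points
    # Fraction of uniform points above the threshold: an empirical false positive rate.
    frac = sum(score(u) >= threshold for u in uniform) / len(uniform)
    return support_volume * frac


rng = random.Random(0)
# Assumed toy distribution: Beta(8, 8), concentrated around 0.5 on [0, 1].
data = [rng.betavariate(8, 8) for _ in range(4000)]
unif = [rng.random() for _ in range(4000)]

# Fit on the first half, evaluate the MV curve on the held-out half.
score = fit_histogram_scorer(data[:2000], unif[:2000])
mv = {a: empirical_mv(score, data[2000:], unif[2000:], a) for a in (0.5, 0.9)}
print(mv)
```

A scoring function that ignores the data would give an MV curve close to the diagonal (volume roughly equal to mass), so values well below alpha indicate that the scorer concentrates the data mass in a small region. In higher dimensions the histogram would be replaced by a proper bipartite ranking algorithm, and the uniform sample would be drawn over an estimate of the support.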

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-clemencon14,
  title = {Anomaly Ranking as Supervised Bipartite Ranking},
  author = {Clémençon, Stephan and Robbiano, Sylvain},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages = {343--351},
  year = {2014},
  editor = {Xing, Eric P. and Jebara, Tony},
  volume = {32},
  number = {2},
  series = {Proceedings of Machine Learning Research},
  address = {Beijing, China},
  month = {22--24 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v32/clemencon14.pdf},
  url = {https://proceedings.mlr.press/v32/clemencon14.html},
  abstract = {The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with a MV curve as low as possible at any point. In this paper, it is proved that, in the case where the data generating probability distribution has compact support, anomaly ranking is equivalent to (supervised) bipartite ranking, where the goal is to discriminate between the underlying probability distribution and the uniform distribution with same support. In this situation, the MV curve can be then seen as a simple transform of the corresponding ROC curve. Exploiting this view, we then show how to use bipartite ranking algorithms, possibly combined with random sampling, to solve the MV curve minimization problem. Numerical experiments based on a variety of bipartite ranking algorithms well-documented in the literature are displayed in order to illustrate the relevance of our approach.}
}
Endnote
%0 Conference Paper
%T Anomaly Ranking as Supervised Bipartite Ranking
%A Stephan Clémençon
%A Sylvain Robbiano
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-clemencon14
%I PMLR
%P 343--351
%U https://proceedings.mlr.press/v32/clemencon14.html
%V 32
%N 2
%X The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with a MV curve as low as possible at any point. In this paper, it is proved that, in the case where the data generating probability distribution has compact support, anomaly ranking is equivalent to (supervised) bipartite ranking, where the goal is to discriminate between the underlying probability distribution and the uniform distribution with same support. In this situation, the MV curve can be then seen as a simple transform of the corresponding ROC curve. Exploiting this view, we then show how to use bipartite ranking algorithms, possibly combined with random sampling, to solve the MV curve minimization problem. Numerical experiments based on a variety of bipartite ranking algorithms well-documented in the literature are displayed in order to illustrate the relevance of our approach.
RIS
TY - CPAPER
TI - Anomaly Ranking as Supervised Bipartite Ranking
AU - Stephan Clémençon
AU - Sylvain Robbiano
BT - Proceedings of the 31st International Conference on Machine Learning
DA - 2014/06/18
ED - Eric P. Xing
ED - Tony Jebara
ID - pmlr-v32-clemencon14
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 32
IS - 2
SP - 343
EP - 351
L1 - http://proceedings.mlr.press/v32/clemencon14.pdf
UR - https://proceedings.mlr.press/v32/clemencon14.html
AB - The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with a MV curve as low as possible at any point. In this paper, it is proved that, in the case where the data generating probability distribution has compact support, anomaly ranking is equivalent to (supervised) bipartite ranking, where the goal is to discriminate between the underlying probability distribution and the uniform distribution with same support. In this situation, the MV curve can be then seen as a simple transform of the corresponding ROC curve. Exploiting this view, we then show how to use bipartite ranking algorithms, possibly combined with random sampling, to solve the MV curve minimization problem. Numerical experiments based on a variety of bipartite ranking algorithms well-documented in the literature are displayed in order to illustrate the relevance of our approach.
ER -
APA
Clémençon, S. & Robbiano, S. (2014). Anomaly Ranking as Supervised Bipartite Ranking. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):343-351. Available from https://proceedings.mlr.press/v32/clemencon14.html.
