Enhanced statistical rankings via targeted data collection

Braxton Osting, Christoph Brune, Stanley Osher
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(1):489-497, 2013.

Abstract

Given a graph where vertices represent alternatives and pairwise comparison data, y_ij, is given on the edges, the statistical ranking problem is to find a potential function, defined on the vertices, such that the gradient of the potential function agrees with pairwise comparisons. We study the dependence of the statistical ranking problem on the available pairwise data, i.e., pairs (i,j) for which the pairwise comparison data y_ij is known, and propose a framework to identify data which, when augmented with the current dataset, maximally increases the Fisher information of the ranking. Under certain assumptions, the data collection problem decouples, reducing to a problem of finding an edge set on the graph (with a fixed number of edges) such that the second eigenvalue of the graph Laplacian is maximal. This reduction of the data collection problem to a spectral graph-theoretic question is one of the primary contributions of this work. As an application, we study the Yahoo! Movie user rating dataset and demonstrate that the addition of a small number of well-chosen pairwise comparisons can significantly increase the Fisher informativeness of the ranking.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-osting13, title = {Enhanced statistical rankings via targeted data collection}, author = {Osting, Braxton and Brune, Christoph and Osher, Stanley}, booktitle = {Proceedings of the 30th International Conference on Machine Learning}, pages = {489--497}, year = {2013}, editor = {Dasgupta, Sanjoy and McAllester, David}, volume = {28}, number = {1}, series = {Proceedings of Machine Learning Research}, address = {Atlanta, Georgia, USA}, month = {17--19 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v28/osting13.pdf}, url = {https://proceedings.mlr.press/v28/osting13.html}, abstract = {Given a graph where vertices represent alternatives and pairwise comparison data, y_ij, is given on the edges, the statistical ranking problem is to find a potential function, defined on the vertices, such that the gradient of the potential function agrees with pairwise comparisons. We study the dependence of the statistical ranking problem on the available pairwise data, i.e., pairs (i,j) for which the pairwise comparison data y_ij is known, and propose a framework to identify data which, when augmented with the current dataset, maximally increases the Fisher information of the ranking. Under certain assumptions, the data collection problem decouples, reducing to a problem of finding an edge set on the graph (with a fixed number of edges) such that the second eigenvalue of the graph Laplacian is maximal. This reduction of the data collection problem to a spectral graph-theoretic question is one of the primary contributions of this work. As an application, we study the Yahoo! Movie user rating dataset and demonstrate that the addition of a small number of well-chosen pairwise comparisons can significantly increase the Fisher informativeness of the ranking. } }
Endnote
%0 Conference Paper %T Enhanced statistical rankings via targeted data collection %A Braxton Osting %A Christoph Brune %A Stanley Osher %B Proceedings of the 30th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2013 %E Sanjoy Dasgupta %E David McAllester %F pmlr-v28-osting13 %I PMLR %P 489--497 %U https://proceedings.mlr.press/v28/osting13.html %V 28 %N 1 %X Given a graph where vertices represent alternatives and pairwise comparison data, y_ij, is given on the edges, the statistical ranking problem is to find a potential function, defined on the vertices, such that the gradient of the potential function agrees with pairwise comparisons. We study the dependence of the statistical ranking problem on the available pairwise data, i.e., pairs (i,j) for which the pairwise comparison data y_ij is known, and propose a framework to identify data which, when augmented with the current dataset, maximally increases the Fisher information of the ranking. Under certain assumptions, the data collection problem decouples, reducing to a problem of finding an edge set on the graph (with a fixed number of edges) such that the second eigenvalue of the graph Laplacian is maximal. This reduction of the data collection problem to a spectral graph-theoretic question is one of the primary contributions of this work. As an application, we study the Yahoo! Movie user rating dataset and demonstrate that the addition of a small number of well-chosen pairwise comparisons can significantly increase the Fisher informativeness of the ranking.
RIS
TY - CPAPER TI - Enhanced statistical rankings via targeted data collection AU - Braxton Osting AU - Christoph Brune AU - Stanley Osher BT - Proceedings of the 30th International Conference on Machine Learning DA - 2013/02/13 ED - Sanjoy Dasgupta ED - David McAllester ID - pmlr-v28-osting13 PB - PMLR DP - Proceedings of Machine Learning Research VL - 28 IS - 1 SP - 489 EP - 497 L1 - http://proceedings.mlr.press/v28/osting13.pdf UR - https://proceedings.mlr.press/v28/osting13.html AB - Given a graph where vertices represent alternatives and pairwise comparison data, y_ij, is given on the edges, the statistical ranking problem is to find a potential function, defined on the vertices, such that the gradient of the potential function agrees with pairwise comparisons. We study the dependence of the statistical ranking problem on the available pairwise data, i.e., pairs (i,j) for which the pairwise comparison data y_ij is known, and propose a framework to identify data which, when augmented with the current dataset, maximally increases the Fisher information of the ranking. Under certain assumptions, the data collection problem decouples, reducing to a problem of finding an edge set on the graph (with a fixed number of edges) such that the second eigenvalue of the graph Laplacian is maximal. This reduction of the data collection problem to a spectral graph-theoretic question is one of the primary contributions of this work. As an application, we study the Yahoo! Movie user rating dataset and demonstrate that the addition of a small number of well-chosen pairwise comparisons can significantly increase the Fisher informativeness of the ranking. ER -
APA
Osting, B., Brune, C. & Osher, S.. (2013). Enhanced statistical rankings via targeted data collection. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(1):489-497 Available from https://proceedings.mlr.press/v28/osting13.html.

Related Material