A. Mohan, Z. Chen & K. Weinberger; JMLR W&CP 14:77–89, 2011.
Web-Search Ranking with Initialized Gradient Boosted Regression Trees
In May 2010, Yahoo! Inc. hosted the Learning to Rank Challenge. This paper summarizes the approach of the highly placed team from Washington University in St. Louis.
We investigate Random Forests (RF) as a low-cost alternative to Gradient Boosted Regression Trees (GBRT), the de facto standard for web-search ranking. We demonstrate that RF yields surprisingly accurate ranking results, comparable to or better than GBRT. We then combine the two algorithms by first learning a ranking function with RF and using it as the initialization for GBRT. We refer to this setting as iGBRT.
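The initialization scheme can be illustrated with a minimal sketch, assuming scikit-learn and a squared-loss objective (the data set, model classes, and hyperparameters below are illustrative stand-ins, not the paper's actual experimental setup). With squared loss, boosting from an RF initialization is equivalent to fitting GBRT to the RF's training residuals and adding the RF output back at prediction time:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

# Toy regression data standing in for query-document relevance labels.
X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=0)

# Step 1: learn a ranking function with Random Forests.
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Step 2: boost starting from the RF predictions. Under squared loss the
# first negative gradient is exactly the residual y - rf.predict(X), so
# fitting GBRT to those residuals initializes boosting with the RF model.
residuals = y - rf.predict(X)
gbrt = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X, residuals)

def igbrt_predict(X_new):
    # iGBRT prediction: RF initialization plus the boosted correction.
    return rf.predict(X_new) + gbrt.predict(X_new)
```

In this sketch the boosted stages only need to correct what the forest gets wrong, rather than learning the target from a constant initialization.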
Following a recent discussion by Li et al. (2007), we show that the results of iGBRT can be improved upon even further when the web-search ranking task is cast as classification instead of regression. We provide an upper bound on the Expected Reciprocal Rank (Chapelle et al., 2009) in terms of classification error and demonstrate that iGBRT outperforms GBRT and RF on the Microsoft Learning to Rank and Yahoo Ranking Competition data sets with surprising
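As background for the bound mentioned above, the Expected Reciprocal Rank of Chapelle et al. (2009) for a ranking of $n$ documents with graded relevance labels $g_1, \dots, g_n$ (and maximum grade $g_{\max}$) is defined, in standard notation not restated in this excerpt, as

$$R_i = \frac{2^{g_i} - 1}{2^{g_{\max}}}, \qquad \mathrm{ERR} = \sum_{r=1}^{n} \frac{1}{r}\, R_r \prod_{i=1}^{r-1} \bigl(1 - R_i\bigr),$$

i.e., the expected reciprocal rank at which a user with stopping probability $R_i$ at position $i$ finds a satisfactory document.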