Collaborative Filtering on a Budget

Alexandros Karatzoglou, Alex Smola, Markus Weimer
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:389-396, 2010.

Abstract

Matrix factorization is a successful technique for building collaborative filtering systems. While it works well on a large range of problems, it is also known for requiring significant amounts of storage for each user or item added to the database. This is a problem whenever the collaborative filtering task is larger than the medium-sized Netflix Prize data. In this paper, we propose a new model for representing and compressing matrix factors via hashing. This allows an essentially unbounded number of users and items to be represented within a pre-defined memory footprint, at a graceful storage/performance trade-off. It allows us to scale recommender systems to very large numbers of users or, conversely, to obtain very good performance even for tiny models (e.g., 400 kB of data suffice for a representation of the EachMovie problem). We provide both experimental results and approximation bounds for our compressed representation, and we show how this approach can be extended to multipartite problems.
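The hashed representation described in the abstract can be made concrete with a short sketch. The Python fragment below is a minimal illustration, not the authors' implementation: it assumes a CRC-based hash, a pseudo-random Rademacher sign, and plain SGD on regularized squared error, and all names (BUDGET, DIM, predict, sgd_step) are hypothetical. The key idea is that each factor entry U[i,k] (and likewise M[j,k]) is read from a fixed-size parameter array through a hash of the (index, component) pair, so memory usage is bounded by BUDGET no matter how many users or items appear.

import zlib
import numpy as np

BUDGET = 2 ** 16  # fixed parameter budget (entries), independent of #users/#items
DIM = 32          # latent dimensionality

rng = np.random.default_rng(0)
u = rng.normal(scale=0.01, size=BUDGET)  # compressed user factors
m = rng.normal(scale=0.01, size=BUDGET)  # compressed item factors

def h(seed, a, k):
    # Deterministic hash of an (index, component) pair into the budget.
    return zlib.crc32(f"{seed}:{a}:{k}".encode()) % BUDGET

def s(seed, a, k):
    # Pseudo-random Rademacher sign; decorrelates hash collisions.
    return 1.0 if zlib.crc32(f"sign:{seed}:{a}:{k}".encode()) & 1 else -1.0

def predict(i, j):
    # Inner product <U_i, M_j>, with both factor rows read through the hash.
    return sum(s(1, i, k) * u[h(1, i, k)] * s(2, j, k) * m[h(2, j, k)]
               for k in range(DIM))

def sgd_step(i, j, rating, lr=0.05, reg=1e-4):
    # One SGD step on the regularized squared error for a single rating.
    err = predict(i, j) - rating
    for k in range(DIM):
        ui, mj = h(1, i, k), h(2, j, k)
        su, sm = s(1, i, k), s(2, j, k)
        gu = err * su * sm * m[mj] + reg * u[ui]
        gm = err * su * sm * u[ui] + reg * m[mj]
        u[ui] -= lr * gu
        m[mj] -= lr * gm

Collisions are handled implicitly in this scheme: two entries that hash to the same slot share a parameter, and the random signs make the resulting interference roughly zero-mean, which is the effect the paper's approximation bounds quantify.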

Cite this Paper


BibTeX
@InProceedings{pmlr-v9-karatzoglou10a,
  title     = {Collaborative Filtering on a Budget},
  author    = {Karatzoglou, Alexandros and Smola, Alex and Weimer, Markus},
  booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {389--396},
  year      = {2010},
  editor    = {Teh, Yee Whye and Titterington, Mike},
  volume    = {9},
  series    = {Proceedings of Machine Learning Research},
  address   = {Chia Laguna Resort, Sardinia, Italy},
  month     = {13--15 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v9/karatzoglou10a/karatzoglou10a.pdf},
  url       = {https://proceedings.mlr.press/v9/karatzoglou10a.html}
}
Endnote
%0 Conference Paper
%T Collaborative Filtering on a Budget
%A Alexandros Karatzoglou
%A Alex Smola
%A Markus Weimer
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-karatzoglou10a
%I PMLR
%P 389--396
%U https://proceedings.mlr.press/v9/karatzoglou10a.html
%V 9
RIS
TY - CPAPER
TI - Collaborative Filtering on a Budget
AU - Alexandros Karatzoglou
AU - Alex Smola
AU - Markus Weimer
BT - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
DA - 2010/03/31
ED - Yee Whye Teh
ED - Mike Titterington
ID - pmlr-v9-karatzoglou10a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 9
SP - 389
EP - 396
L1 - http://proceedings.mlr.press/v9/karatzoglou10a/karatzoglou10a.pdf
UR - https://proceedings.mlr.press/v9/karatzoglou10a.html
ER -
APA
Karatzoglou, A., Smola, A. & Weimer, M. (2010). Collaborative Filtering on a Budget. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 9:389-396. Available from https://proceedings.mlr.press/v9/karatzoglou10a.html.