Non-negative Matrix Factorization for Discrete Data with Hierarchical Side-Information

Changwei Hu, Piyush Rai, Lawrence Carin
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1124-1132, 2016.

Abstract

We present a probabilistic framework for efficient non-negative matrix factorization of discrete (count/binary) data with side-information. The side-information is given as a multi-level structure, taxonomy, or ontology, with nodes at each level being categorical-valued observations. For example, when modeling documents with a two-level side-information (documents being at level-zero), level-one may represent (one or more) authors associated with each document and level-two may represent affiliations of each author. The model easily generalizes to more than two levels (or taxonomy/ontology of arbitrary depth). Our model can learn embeddings of entities present at each level in the data/side-information hierarchy (e.g., documents, authors, affiliations, in the previous example), with appropriate sharing of information across levels. The model also enjoys full local conjugacy, facilitating efficient Gibbs sampling for model inference. Inference cost scales in the number of non-zero entries in the data matrix, which is especially appealing for real-world massive but sparse matrices. We demonstrate the effectiveness of the model on several real-world data sets.
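The abstract's claim that inference cost scales with the number of non-zero entries is characteristic of Poisson-likelihood factorization: the data-dependent log-likelihood term touches only observed non-zeros, while the normalizer factorizes over the latent dimensions. A minimal sketch of this scaling property (illustrative only; this is not the authors' hierarchical model, and all names and toy sizes here are our own):

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.special import gammaln

rng = np.random.default_rng(0)

# Toy counts matrix (documents x words), mostly zeros.
X = coo_matrix(rng.poisson(0.05, size=(50, 40)))

K = 5  # number of latent factors
W = rng.gamma(1.0, 1.0, size=(50, K))   # non-negative document factors
H = rng.gamma(1.0, 1.0, size=(K, 40))   # non-negative word factors

def poisson_loglik_sparse(X, W, H):
    """Poisson log-likelihood of sparse counts X under rate W @ H.

    The data term sum_{ij} x_ij * log(rate_ij) is non-zero only at
    observed entries, so it costs O(nnz * K).  The normalizer
    sum_{ij} rate_ij factorizes as sum_k (sum_i W_ik)(sum_j H_kj),
    costing O((M + N) * K) -- no dense M x N matrix is ever formed.
    """
    rates_nz = np.einsum('ik,ik->i', W[X.row], H[:, X.col].T)
    ll = np.sum(X.data * np.log(rates_nz) - gammaln(X.data + 1))
    ll -= W.sum(axis=0) @ H.sum(axis=1)  # total rate mass, O((M+N)K)
    return ll

print(poisson_loglik_sparse(X, W, H))
```

The same trick is what makes Gibbs or EM updates for Poisson-type factorization models cheap on massive sparse matrices: every per-entry quantity is needed only where the count is non-zero.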

Cite this Paper


BibTeX
@InProceedings{pmlr-v51-hu16c,
  title     = {Non-negative Matrix Factorization for Discrete Data with Hierarchical Side-Information},
  author    = {Hu, Changwei and Rai, Piyush and Carin, Lawrence},
  booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages     = {1124--1132},
  year      = {2016},
  editor    = {Gretton, Arthur and Robert, Christian C.},
  volume    = {51},
  series    = {Proceedings of Machine Learning Research},
  address   = {Cadiz, Spain},
  month     = {09--11 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v51/hu16c.pdf},
  url       = {https://proceedings.mlr.press/v51/hu16c.html},
  abstract  = {We present a probabilistic framework for efficient non-negative matrix factorization of discrete (count/binary) data with side-information. The side-information is given as a multi-level structure, taxonomy, or ontology, with nodes at each level being categorical-valued observations. For example, when modeling documents with a two-level side-information (documents being at level-zero), level-one may represent (one or more) authors associated with each document and level-two may represent affiliations of each author. The model easily generalizes to more than two levels (or taxonomy/ontology of arbitrary depth). Our model can learn embeddings of entities present at each level in the data/side-information hierarchy (e.g., documents, authors, affiliations, in the previous example), with appropriate sharing of information across levels. The model also enjoys full local conjugacy, facilitating efficient Gibbs sampling for model inference. Inference cost scales in the number of non-zero entries in the data matrix, which is especially appealing for real-world massive but sparse matrices. We demonstrate the effectiveness of the model on several real-world data sets.}
}
Endnote
%0 Conference Paper
%T Non-negative Matrix Factorization for Discrete Data with Hierarchical Side-Information
%A Changwei Hu
%A Piyush Rai
%A Lawrence Carin
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-hu16c
%I PMLR
%P 1124--1132
%U https://proceedings.mlr.press/v51/hu16c.html
%V 51
%X We present a probabilistic framework for efficient non-negative matrix factorization of discrete (count/binary) data with side-information. The side-information is given as a multi-level structure, taxonomy, or ontology, with nodes at each level being categorical-valued observations. For example, when modeling documents with a two-level side-information (documents being at level-zero), level-one may represent (one or more) authors associated with each document and level-two may represent affiliations of each author. The model easily generalizes to more than two levels (or taxonomy/ontology of arbitrary depth). Our model can learn embeddings of entities present at each level in the data/side-information hierarchy (e.g., documents, authors, affiliations, in the previous example), with appropriate sharing of information across levels. The model also enjoys full local conjugacy, facilitating efficient Gibbs sampling for model inference. Inference cost scales in the number of non-zero entries in the data matrix, which is especially appealing for real-world massive but sparse matrices. We demonstrate the effectiveness of the model on several real-world data sets.
RIS
TY  - CPAPER
TI  - Non-negative Matrix Factorization for Discrete Data with Hierarchical Side-Information
AU  - Changwei Hu
AU  - Piyush Rai
AU  - Lawrence Carin
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-hu16c
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1124
EP  - 1132
L1  - http://proceedings.mlr.press/v51/hu16c.pdf
UR  - https://proceedings.mlr.press/v51/hu16c.html
AB  - We present a probabilistic framework for efficient non-negative matrix factorization of discrete (count/binary) data with side-information. The side-information is given as a multi-level structure, taxonomy, or ontology, with nodes at each level being categorical-valued observations. For example, when modeling documents with a two-level side-information (documents being at level-zero), level-one may represent (one or more) authors associated with each document and level-two may represent affiliations of each author. The model easily generalizes to more than two levels (or taxonomy/ontology of arbitrary depth). Our model can learn embeddings of entities present at each level in the data/side-information hierarchy (e.g., documents, authors, affiliations, in the previous example), with appropriate sharing of information across levels. The model also enjoys full local conjugacy, facilitating efficient Gibbs sampling for model inference. Inference cost scales in the number of non-zero entries in the data matrix, which is especially appealing for real-world massive but sparse matrices. We demonstrate the effectiveness of the model on several real-world data sets.
ER  -
APA
Hu, C., Rai, P. & Carin, L. (2016). Non-negative Matrix Factorization for Discrete Data with Hierarchical Side-Information. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1124-1132. Available from https://proceedings.mlr.press/v51/hu16c.html.