Topic-Based Embeddings for Learning from Large Knowledge Graphs

Changwei Hu, Piyush Rai, Lawrence Carin
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1133-1141, 2016.

Abstract

We present a scalable probabilistic framework for learning from multi-relational data given in the form of entity-relation-entity triplets, with a potentially massive number of entities and relations (e.g., in multi-relational networks, knowledge bases, etc.). We define each triplet via a relation-specific bilinear function of the embeddings of the entities associated with it (these embeddings correspond to “topics”). To handle a massive number of relations and the data sparsity problem (very few observations per relation), we also extend this model to allow sharing of parameters across relations, which leads to a substantial reduction in the number of parameters to be learned. In addition to yielding excellent predictive performance (e.g., for knowledge base completion tasks), the interpretability of our topic-based embedding framework enables easy qualitative analyses. The computational cost of our models scales with the number of positive triplets, which makes it easy to scale to massive real-world multi-relational data sets, which are usually extremely sparse. We develop simple-to-implement batch as well as online Gibbs sampling algorithms and demonstrate the effectiveness of our models on tasks such as multi-relational link prediction and learning from large knowledge bases.
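As a rough illustration of the relation-specific bilinear function and the parameter sharing mentioned in the abstract, the following minimal Python sketch scores one triplet and builds a relation matrix from shared bases. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (bilinear_score, shared_relation_matrix), the toy entities, the non-negative values, and in particular the choice to share parameters by writing each relation matrix as a weighted sum of basis matrices. The paper's likelihood and Gibbs sampling updates are not shown.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def bilinear_score(head: Vector, relation: Matrix, tail: Vector) -> float:
    """Return head^T * relation * tail for one (head, relation, tail) triplet."""
    return sum(
        head[a] * relation[a][b] * tail[b]
        for a in range(len(head))
        for b in range(len(tail))
    )


def shared_relation_matrix(weights: Vector, bases: List[Matrix]) -> Matrix:
    """Build one relation matrix as a weighted sum of bases shared by all relations.

    This particular sharing scheme is an assumption made for illustration.
    """
    k = len(bases[0])
    return [
        [sum(w * basis[a][b] for w, basis in zip(weights, bases)) for b in range(k)]
        for a in range(k)
    ]


# Hypothetical toy data with K = 2 "topics" per entity.
alice: Vector = [1.0, 0.0]  # loads only on topic 0
bob: Vector = [0.0, 2.0]    # loads only on topic 1

# Relation matrix that rewards cross-topic pairs.
works_with: Matrix = [[0.0, 0.5], [0.5, 0.0]]
score = bilinear_score(alice, works_with, bob)

# Two shared basis matrices; each relation stores only its mixing weights.
bases: List[Matrix] = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
rebuilt = shared_relation_matrix([0.0, 0.5], bases)
```

Under this assumed sharing scheme, R relations with K topics and M bases need M*K*K + R*M relation parameters instead of R*K*K, which is the kind of reduction the abstract refers to when M is much smaller than R.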

Cite this Paper


BibTeX
@InProceedings{pmlr-v51-hu16d,
  title = {Topic-Based Embeddings for Learning from Large Knowledge Graphs},
  author = {Hu, Changwei and Rai, Piyush and Carin, Lawrence},
  booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = {1133--1141},
  year = {2016},
  editor = {Gretton, Arthur and Robert, Christian C.},
  volume = {51},
  series = {Proceedings of Machine Learning Research},
  address = {Cadiz, Spain},
  month = {09--11 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v51/hu16d.pdf},
  url = {https://proceedings.mlr.press/v51/hu16d.html},
  abstract = {We present a scalable probabilistic framework for learning from multi-relational data given in form of entity-relation-entity triplets, with a potentially massive number of entities and relations (e.g., in multi-relational networks, knowledge bases, etc.). We define each triplet via a relation-specific bilinear function of the embeddings of entities associated with it (these embeddings correspond to “topics”). To handle massive number of relations and the data sparsity problem (very few observations per relation), we also extend this model to allow sharing of parameters across relations, which leads to a substantial reduction in the number of parameters to be learned. In addition to yielding excellent predictive performance (e.g., for knowledge base completion tasks), the interpretability of our topic-based embedding framework enables easy qualitative analyses. Computational cost of our models scales in the number of positive triplets, which makes it easy to scale to massive real-world multi-relational data sets, which are usually extremely sparse. We develop simple-to-implement batch as well as online Gibbs sampling algorithms and demonstrate the effectiveness of our models on tasks such as multi-relational link-prediction, and learning from large knowledge bases.}
}
Endnote
%0 Conference Paper
%T Topic-Based Embeddings for Learning from Large Knowledge Graphs
%A Changwei Hu
%A Piyush Rai
%A Lawrence Carin
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-hu16d
%I PMLR
%P 1133--1141
%U https://proceedings.mlr.press/v51/hu16d.html
%V 51
%X We present a scalable probabilistic framework for learning from multi-relational data given in form of entity-relation-entity triplets, with a potentially massive number of entities and relations (e.g., in multi-relational networks, knowledge bases, etc.). We define each triplet via a relation-specific bilinear function of the embeddings of entities associated with it (these embeddings correspond to “topics”). To handle massive number of relations and the data sparsity problem (very few observations per relation), we also extend this model to allow sharing of parameters across relations, which leads to a substantial reduction in the number of parameters to be learned. In addition to yielding excellent predictive performance (e.g., for knowledge base completion tasks), the interpretability of our topic-based embedding framework enables easy qualitative analyses. Computational cost of our models scales in the number of positive triplets, which makes it easy to scale to massive real-world multi-relational data sets, which are usually extremely sparse. We develop simple-to-implement batch as well as online Gibbs sampling algorithms and demonstrate the effectiveness of our models on tasks such as multi-relational link-prediction, and learning from large knowledge bases.
RIS
TY  - CPAPER
TI  - Topic-Based Embeddings for Learning from Large Knowledge Graphs
AU  - Changwei Hu
AU  - Piyush Rai
AU  - Lawrence Carin
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-hu16d
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1133
EP  - 1141
L1  - http://proceedings.mlr.press/v51/hu16d.pdf
UR  - https://proceedings.mlr.press/v51/hu16d.html
AB  - We present a scalable probabilistic framework for learning from multi-relational data given in form of entity-relation-entity triplets, with a potentially massive number of entities and relations (e.g., in multi-relational networks, knowledge bases, etc.). We define each triplet via a relation-specific bilinear function of the embeddings of entities associated with it (these embeddings correspond to “topics”). To handle massive number of relations and the data sparsity problem (very few observations per relation), we also extend this model to allow sharing of parameters across relations, which leads to a substantial reduction in the number of parameters to be learned. In addition to yielding excellent predictive performance (e.g., for knowledge base completion tasks), the interpretability of our topic-based embedding framework enables easy qualitative analyses. Computational cost of our models scales in the number of positive triplets, which makes it easy to scale to massive real-world multi-relational data sets, which are usually extremely sparse. We develop simple-to-implement batch as well as online Gibbs sampling algorithms and demonstrate the effectiveness of our models on tasks such as multi-relational link-prediction, and learning from large knowledge bases.
ER  -
APA
Hu, C., Rai, P. & Carin, L. (2016). Topic-Based Embeddings for Learning from Large Knowledge Graphs. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1133-1141. Available from https://proceedings.mlr.press/v51/hu16d.html.
