Efficient Dimensionality Reduction for High-Dimensional Network Estimation

Safiye Celik, Benjamin Logsdon, Su-In Lee
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1953-1961, 2014.

Abstract

We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.
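
For context, the standard graphical lasso that the abstract names as a baseline is usually written as the following penalized maximum-likelihood problem. This is the well-known baseline objective, not MGL's own objective, which the abstract does not state. The symbols are assumptions introduced here for illustration: S is the empirical covariance matrix, Θ the precision (inverse covariance) matrix, and λ the sparsity penalty.

```latex
\[
  \hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}\;
    \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta) \;-\; \lambda\,\lVert \Theta \rVert_{1}
\]
```

In a GGM, a zero entry Θ_ij = 0 means that variables i and j are conditionally independent given all the others. Per the abstract, MGL estimates this kind of structure among the module-level latent variables rather than among the individual observed variables.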

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-celik14,
  title = {Efficient Dimensionality Reduction for High-Dimensional Network Estimation},
  author = {Celik, Safiye and Logsdon, Benjamin and Lee, Su-In},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages = {1953--1961},
  year = {2014},
  editor = {Xing, Eric P. and Jebara, Tony},
  volume = {32},
  number = {2},
  series = {Proceedings of Machine Learning Research},
  address = {Beijing, China},
  month = {22--24 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v32/celik14.pdf},
  url = {https://proceedings.mlr.press/v32/celik14.html},
  abstract = {We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.}
}
Endnote
%0 Conference Paper
%T Efficient Dimensionality Reduction for High-Dimensional Network Estimation
%A Safiye Celik
%A Benjamin Logsdon
%A Su-In Lee
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-celik14
%I PMLR
%P 1953--1961
%U https://proceedings.mlr.press/v32/celik14.html
%V 32
%N 2
%X We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.
RIS
TY - CPAPER
TI - Efficient Dimensionality Reduction for High-Dimensional Network Estimation
AU - Safiye Celik
AU - Benjamin Logsdon
AU - Su-In Lee
BT - Proceedings of the 31st International Conference on Machine Learning
DA - 2014/06/18
ED - Eric P. Xing
ED - Tony Jebara
ID - pmlr-v32-celik14
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 32
IS - 2
SP - 1953
EP - 1961
L1 - http://proceedings.mlr.press/v32/celik14.pdf
UR - https://proceedings.mlr.press/v32/celik14.html
AB - We propose module graphical lasso (MGL), an aggressive dimensionality reduction and network estimation technique for a high-dimensional Gaussian graphical model (GGM). MGL achieves scalability, interpretability and robustness by exploiting the modularity property of many real-world networks. Variables are organized into tightly coupled modules and a graph structure is estimated to determine the conditional independencies among modules. MGL iteratively learns the module assignment of variables, the latent variables, each corresponding to a module, and the parameters of the GGM of the latent variables. In synthetic data experiments, MGL outperforms the standard graphical lasso and three other methods that incorporate latent variables into GGMs. When applied to gene expression data from ovarian cancer, MGL outperforms standard clustering algorithms in identifying functionally coherent gene sets and predicting survival time of patients. The learned modules and their dependencies provide novel insights into cancer biology as well as identifying possible novel drug targets.
ER -
APA
Celik, S., Logsdon, B. & Lee, S. (2014). Efficient Dimensionality Reduction for High-Dimensional Network Estimation. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1953-1961. Available from https://proceedings.mlr.press/v32/celik14.html.
