Deep Boltzmann Machines

Ruslan Salakhutdinov; Geoffrey Hinton

Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:448-455, 2009.

Abstract

We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent expectations are estimated using a variational approximation that tends to focus on a single mode, and data-independent expectations are approximated using persistent Markov chains. The use of two quite different techniques for estimating the two types of expectation that enter into the gradient of the log-likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer “pre-training” phase that allows variational inference to be initialized by a single bottom-up pass. We present results on the MNIST and NORB datasets showing that deep Boltzmann machines learn good generative models and perform well on handwritten digit and visual object recognition tasks.
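The two estimators described in the abstract can be illustrated with a minimal sketch for a Boltzmann machine with two hidden layers. This is not the authors' implementation: the function names, the omission of bias terms, the learning rate, and the fixed number of mean-field iterations are all assumptions made for illustration, and the bottom-up initialisation here is a simplified stand-in for the pre-training phase.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p, rng):
    # Draw binary states with the given Bernoulli probabilities.
    return (rng.random(p.shape) < p).astype(float)

def mean_field(v, W1, W2, n_iters=10):
    # Variational (mean-field) estimate of the hidden posteriors for data v.
    # Initialised by a single bottom-up pass (simplified; an assumption).
    mu1 = sigmoid(v @ W1)
    mu2 = sigmoid(mu1 @ W2)
    for _ in range(n_iters):
        # The first hidden layer receives bottom-up and top-down input.
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T)
        mu2 = sigmoid(mu1 @ W2)
    return mu1, mu2

def gibbs_step(v, h1, h2, W1, W2, rng):
    # One Gibbs sweep over the persistent chains (no data involved).
    h1 = sample(sigmoid(v @ W1 + h2 @ W2.T), rng)
    v = sample(sigmoid(h1 @ W1.T), rng)
    h2 = sample(sigmoid(h1 @ W2), rng)
    return v, h1, h2

def dbm_update(v_data, chains, W1, W2, rng, lr=0.01):
    # Data-dependent expectations: variational approximation.
    mu1, mu2 = mean_field(v_data, W1, W2)
    # Data-independent expectations: advance the persistent Markov chains.
    chains = gibbs_step(*chains, W1, W2, rng)
    v_f, h1_f, h2_f = chains
    n, m = len(v_data), len(v_f)
    # Approximate log-likelihood gradient: difference of the two expectations.
    W1 = W1 + lr * (v_data.T @ mu1 / n - v_f.T @ h1_f / m)
    W2 = W2 + lr * (mu1.T @ mu2 / n - h1_f.T @ h2_f / m)
    return W1, W2, chains
```

In this sketch the chains returned by `dbm_update` are passed back in on the next call rather than being reinitialised, which is what makes them persistent.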

Cite this Paper

BibTeX


@InProceedings{pmlr-v5-salakhutdinov09a,
  title = 	 {Deep Boltzmann Machines},
  author = 	 {Salakhutdinov, Ruslan and Hinton, Geoffrey},
  booktitle = 	 {Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {448--455},
  year = 	 {2009},
  editor = 	 {van Dyk, David and Welling, Max},
  volume = 	 {5},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA},
  month = 	 {16--18 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v5/salakhutdinov09a/salakhutdinov09a.pdf},
  url = 	 {https://proceedings.mlr.press/v5/salakhutdinov09a.html},
  abstract = 	 {We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent expectations are estimated using a variational approximation that tends to focus on a single mode, and data-independent expectations are approximated using persistent Markov chains. The use of two quite different techniques for estimating the two types of expectation that enter into the gradient of the log-likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer “pre-training” phase that allows variational inference to be initialized by a single bottom-up pass. We present results on the MNIST and NORB datasets showing that deep Boltzmann machines learn good generative models and perform well on handwritten digit and visual object recognition tasks.}
}

Endnote

%0 Conference Paper
%T Deep Boltzmann Machines
%A Ruslan Salakhutdinov
%A Geoffrey Hinton
%B Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2009
%E David van Dyk
%E Max Welling
%F pmlr-v5-salakhutdinov09a
%I PMLR
%P 448--455
%U https://proceedings.mlr.press/v5/salakhutdinov09a.html
%V 5
%X We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent expectations are estimated using a variational approximation that tends to focus on a single mode, and data-independent expectations are approximated using persistent Markov chains. The use of two quite different techniques for estimating the two types of expectation that enter into the gradient of the log-likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer “pre-training” phase that allows variational inference to be initialized by a single bottom-up pass. We present results on the MNIST and NORB datasets showing that deep Boltzmann machines learn good generative models and perform well on handwritten digit and visual object recognition tasks.

RIS


TY  - CPAPER
TI  - Deep Boltzmann Machines
AU  - Ruslan Salakhutdinov
AU  - Geoffrey Hinton
BT  - Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
DA  - 2009/04/15
ED  - David van Dyk
ED  - Max Welling
ID  - pmlr-v5-salakhutdinov09a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 5
SP  - 448
EP  - 455
L1  - http://proceedings.mlr.press/v5/salakhutdinov09a/salakhutdinov09a.pdf
UR  - https://proceedings.mlr.press/v5/salakhutdinov09a.html
AB  - We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent expectations are estimated using a variational approximation that tends to focus on a single mode, and data-independent expectations are approximated using persistent Markov chains. The use of two quite different techniques for estimating the two types of expectation that enter into the gradient of the log-likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer “pre-training” phase that allows variational inference to be initialized by a single bottom-up pass. We present results on the MNIST and NORB datasets showing that deep Boltzmann machines learn good generative models and perform well on handwritten digit and visual object recognition tasks.
ER  -

APA


Salakhutdinov, R. & Hinton, G. (2009). Deep Boltzmann Machines. Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 5:448-455. Available from https://proceedings.mlr.press/v5/salakhutdinov09a.html.
