Learning Thin Junction Trees via Graph Cuts
Shahaf Dafna, Carlos Guestrin; JMLR W&CP 5:113-120, 2009.
Structure learning algorithms usually focus on the compactness of the learned model. However, compact representation does not imply the existence of tractable inference algorithms: both exact and approximate inference may still be NP-hard. This focus on compactness leads to learning good models that require approximate inference techniques, thus reducing their prediction quality. In this paper, we propose a method for learning an attractive class of models: bounded treewidth junction trees. Those models permit both compact representation of probability distributions and efficient exact inference. Our method uses a new global criterion to construct the tree. Our criterion, based on a Bethe approximation of the likelihood, transforms the structure learning problem into an intuitive graph theoretical task. We present an efficient randomized algorithm with theoretical guarantees for finding good separators. We recursively apply this procedure to obtain a thin junction tree. Our extensive empirical evaluation demonstrates the benefit of applying exact inference using our models to answer queries. We also extend our technique to learn low tree-width conditional random fields, and demonstrate significant improvements over state of the art block-L1 regularization techniques.