Parallel Algorithm for Learning Optimal Bayesian Network Structure

Yoshinori Tamada; Seiya Imoto; Satoru Miyano

We present a parallel algorithm for the score-based optimal structure search of Bayesian networks. This algorithm is based on a dynamic programming (DP) algorithm having O(n ⋅ 2ⁿ) time and space complexity, which is known to be the fastest algorithm for the optimal structure search of networks with n nodes. The bottleneck of the problem is the memory requirement, and therefore, the algorithm is currently applicable for up to a few tens of nodes. While the recently proposed algorithm overcomes this limitation by a space-time trade-off, our proposed algorithm realizes direct parallelization of the original DP algorithm with O(n^σ) time and space overhead calculations, where σ>0 controls the communication-space trade-off. The overall time and space complexity is O(n^σ+1 2ⁿ). This algorithm splits the search space so that the required communication between independent calculations is minimal. Because of this advantage, our algorithm can run on distributed memory supercomputers. Through computational experiments, we confirmed that our algorithm can run in parallel using up to 256 processors with a parallelization efficiency of 0.74, compared to the original DP algorithm with a single processor. We also demonstrate optimal structure search for a 32-node network without any constraints, which is the largest network search presented in literature.

Parallel Algorithm for Learning Optimal Bayesian Network Structure

Abstract