Andrew Moore and Weng-Keen Wong
We show how a conceptually simple search operator called Optimal Reinsertion can be applied learning Bayesian Network structure from data. On each step we pick a node called the target. We delete all arcs entering or exiting the target. We then find, subject to some constraints, the globally optimal combination of in-arcs and out-arcs with which to reinsert it. The heart of the paper is a new algorithm called ORSearch which allows each optimal reinsertion step to be computed efficiently on large datasets. Our empirical results compare Optimal Reinsertion against a highly tuned implementation of multi-restart hill climbing. The results typically show one two orders of magnitude speed-up on a variety datasets. They usually show better final results, both in terms of BDEU score and in modeling future data drawn from the same distribution.