Blai Bonet, Hector Geffner
Dynamic Programming provides a convenient and unified framework for studying many state models used in AI but no algorithms for handling large spaces. Heuristic-search methods, on the other hand, can handle large spaces but lack a common foundation. In this work, we combine the benefits of a general dynamic programming formulation with the power of heuristic-search techniques for developing an algorithmic framework, that we call Learning Depth-First Search, that aims to be both general and effective. LDFS is a simple piece of code that performs iterated depth-first searches enhanced with learning. For deterministic actions and monotone value functions, LDFS reduces to IDA* with transposition tables, while for Game Trees, to the state-of-the-art iterated Alpha-Beta search algorithm with Null Windows known as MTD. For other models, like AND/OR graphs and MDPs, LDFS yields new, simple, and competitive algorithms. We show this here for MDPs.