AAAI Publications, The Twenty-Eighth International Flairs Conference

On-Line Learning of Multi-Valued Decision Diagrams
Jean-Christophe Magnan, Pierre-Henri Wuillemin

Last modified: 2015-04-07


In the domain of decision-theoretic planning, the factored framework (Factored Markov Decision Process, FMDP) has produced optimized algorithms using Decision Trees (Structured Value Iteration (SVI), Structured Policy Iteration (SPI)) or Algebraic Decision Diagrams (Stochastic Planning Using Decision Diagrams (SPUDD)). Since it may be difficult to determine the factored models of such complex stochastic environments, the SDYNA framework, which combines planning and learning algorithms using structured representations, was proposed. Yet, the state-of-the-art algorithms for incremental learning, for structured decision-theoretic planning, or for reinforcement learning require the problem to be specified only with binary variables and/or use data structures that can be improved in terms of compactness. Recently, Multi-Valued Decision Diagrams (MDDs) were proposed as more efficient data structures, and planning algorithms dedicated to these data structures were provided. The purpose of this article is to study how to incrementally learn such compact factored models on discrete domains. Its main result is an online learning algorithm for MDDs that shows significant improvements both in the quality of the learned model and in learning time. Finally, we show that our algorithm leads to a SDYNA framework for simultaneous learning and planning using MDDs.
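The compactness of MDDs over decision trees comes from two reduction rules: redundant tests are skipped, and isomorphic subgraphs are shared. The following is a minimal, hypothetical sketch of a reduced MDD over discrete variables, not the authors' implementation; the class and method names are illustrative assumptions.

```python
# Minimal sketch of a reduced Multi-Valued Decision Diagram (MDD).
# Illustrative only; this is NOT the authors' data structure or algorithm.

class MDD:
    """Internal nodes test a discrete variable and branch on its value
    (one child per value); isomorphic subgraphs are shared via a
    unique table, which is what makes the diagram compact."""

    def __init__(self):
        self._unique = {}  # (var, children) -> canonical node

    def terminal(self, value):
        """Leaf holding a value (e.g. a reward or transition probability)."""
        return ('T', value)

    def node(self, var, children):
        """Create (or reuse) the node testing `var` with the given children."""
        # Reduction rule 1: skip redundant tests (all children identical).
        if all(c == children[0] for c in children):
            return children[0]
        # Reduction rule 2: share isomorphic nodes via the unique table.
        key = (var, tuple(children))
        if key not in self._unique:
            self._unique[key] = ('N', var, tuple(children))
        return self._unique[key]

    def evaluate(self, node, assignment):
        """Follow branches under `assignment` (var -> value index)."""
        while node[0] == 'N':
            _, var, children = node
            node = children[assignment[var]]
        return node[1]


# Usage: f(x, y) with x in {0,1,2}, y in {0,1}; f = 0 if x = 0, else y.
mdd = MDD()
leaf0, leaf1 = mdd.terminal(0), mdd.terminal(1)
y_node = mdd.node('y', [leaf0, leaf1])
root = mdd.node('x', [leaf0, y_node, y_node])  # x=1 and x=2 share a subgraph
print(mdd.evaluate(root, {'x': 2, 'y': 1}))    # -> 1
```

A binary decision tree for the same function would need to duplicate the test on `y` under every non-zero value of `x`; the unique table collapses those copies into a single shared node.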


Factored Representations; Incremental Learning
