Jia-Hong Wu, Robert Givan
In sequential decision-making problems formulated as Markov decision processes, state-value function approximation using domain features is a critical technique for scaling up the feasible problem size. We consider the problem of automatically finding useful domain features in problem domains that exhibit relational structure. Specifically we consider learning compact relational features without input from human expertise; we use neither expert decisions nor human domain knowledge beyond the basic domain definition. We propose a method to learn relational features for a linear value-function representation---numerically valued features are selected by their fit to the Bellman residual of the current value function and are automatically learned and added to the representation when needed. Starting with only a trivial feature in the value-function representation, our method finds useful value functions by combining feature learning with approximate value iteration. Empirical work presented here for Tetris and for probabilistic planning competition domains shows that our technique represents the state-of-the-art for both domain-independent feature learning and for stochastic planning in relational domains.
Subjects: 1.11 Planning; 12.1 Reinforcement Learning
Submitted: Jun 27, 2007