Mehran Asadi, Manfred Huber
This paper presents a new method for the autonomous construction of hierarchical action and state representations in reinforcement learning, aimed at accelerating learning and extending the scope of such systems. In this approach, the agent uses information acquired while learning one task to discover subgoals by analyzing the learned policy with Monte Carlo sampling. By creating useful new subgoals and learning the corresponding subtask policies off-line as abstract actions, the agent is able to transfer knowledge to subsequent tasks and to accelerate learning. At the same time, the subgoal actions are used to construct a more abstract state representation through action-dependent approximate state space partitioning. This representation forms a new level in a state space hierarchy and serves as the initial representation for new learning tasks. To ensure that tasks remain learnable, value functions are built simultaneously at different levels of the hierarchy, and inconsistencies are used to identify the actions needed to refine the relevant portions of the abstract state space. Together, these techniques permit the agent to form more abstract action and state representations over time. Experiments in deterministic and stochastic domains show that this method can significantly outperform learning on a flat state space representation.
Subjects: 12. Machine Learning and Discovery; 12.1 Reinforcement Learning
Submitted: Apr 5, 2005
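
As an illustration of the subgoal discovery step summarized in the abstract, the following is a minimal sketch of how a policy might be analyzed with Monte Carlo sampling. The abstract does not state the selection criterion, so everything here is an assumption made for illustration: the two-room grid, the hand-written `policy` standing in for a learned one, the helper names `sample_visit_counts` and `find_subgoals`, and the rule that flags a state when its visit count exceeds `ratio` times that of its busiest predecessor.

```python
import random
from collections import Counter

# Hypothetical two-room grid: column 3 is a wall with a single doorway.
WIDTH, HEIGHT = 7, 3
DOOR, GOAL = (3, 1), (6, 1)
STATES = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)
          if x != 3 or (x, y) == DOOR]

def policy(state):
    """Stand-in for a learned greedy policy: maps a state to its successor."""
    x, y = state
    exit_x = 2 if x < 3 else 6            # column holding this room's exit
    if x == 3 or (x == exit_x and y == 1):
        return (x + 1, y)                 # step into or out of the doorway
    if x < exit_x:
        return (x + 1, y)                 # head for the exit column
    return (x, y + (1 if y < 1 else -1))  # move along it toward row 1

def sample_visit_counts(n_episodes, rng):
    """Roll out the policy from random start states and count state visits."""
    counts = Counter()
    starts = [s for s in STATES if s != GOAL]
    for _ in range(n_episodes):
        state = rng.choice(starts)
        while state != GOAL:
            counts[state] += 1
            state = policy(state)
    return counts

def find_subgoals(counts, ratio=2.5):
    """Assumed criterion: flag states visited far more often than any
    single predecessor, i.e. states where many sampled paths converge."""
    preds = {}
    for s in counts:
        preds.setdefault(policy(s), []).append(s)
    return [s for s, ps in preds.items()
            if s != GOAL and counts[s] > ratio * max(counts[p] for p in ps)]

counts = sample_visit_counts(20000, random.Random(0))
subgoals = find_subgoals(counts)
```

In this toy setting the flagged state is the cell directly in front of the doorway, where the paths from the left room merge. A flagged state of this kind could then serve as the target of a subtask policy learned off-line, in the spirit of the abstract actions described above.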