On Policy Learning in Restricted Policy Spaces

Robby Goetschalckx, Jan Ramon

We discuss the problem of policy learning in a Markov Decision Process where only a restricted subset of the full policy space can be used. In this way, useful background knowledge can be incorporated to reduce the search space. This is useful when we know the optimal policy belongs to a specific subset of the full policy space, or when only a limited part of the policy space is usable in practice. We suggest and discuss a number of different approaches based on existing work in policy search methods. None of these methods can be easily adapted to handle the setting of a restricted policy space. We point out a number of difficulties which arise and assumptions which have to be made for some approaches to work.

Subjects: 12.1 Reinforcement Learning; 15.2 Constraint Satisfaction

Submitted: Apr 12, 2007

This page is copyrighted by AAAI. All rights reserved.