AAAI Publications, Twenty-First International Joint Conference on Artificial Intelligence

Inverse Reinforcement Learning in Partially Observable Environments
Jaedeug Choi, Kee-Eung Kim

Last modified: 2009-06-26


Inverse reinforcement learning (IRL) is the problem of recovering the underlying reward function from the behaviour of an expert. Most existing IRL algorithms assume that the expert's environment is modeled as a Markov decision process (MDP), although they should handle partially observable settings in order to widen their applicability to more realistic scenarios. In this paper, we present an extension of the classical IRL algorithm by Ng and Russell to partially observable environments. We discuss technical issues and challenges, and present experimental results on several benchmark partially observable domains.
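For context, the classical Ng and Russell (2000) algorithm that the abstract extends characterizes, in the fully observable MDP case, the set of rewards consistent with an expert policy by a linear feasibility condition: if the expert takes action a1 in every state, a reward vector R is consistent iff (P_a1 - P_a)(I - γ P_a1)^{-1} R ≥ 0 for every alternative action a. A minimal sketch with made-up toy matrices (not data from the paper):

```python
import numpy as np

# Toy 2-state MDP (example data, not from the paper). State 0 carries
# reward; the expert's action a1 steers probability mass toward state 0.
gamma = 0.9
P_a1 = np.array([[0.9, 0.1],
                 [0.7, 0.3]])   # transitions under the expert's action
P_a2 = np.array([[0.1, 0.9],
                 [0.2, 0.8]])   # transitions under an alternative action
R = np.array([1.0, 0.0])        # candidate reward, one entry per state

def consistent(P_opt, P_alt, R, gamma):
    """Ng & Russell's condition: (P_opt - P_alt) (I - gamma*P_opt)^-1 R >= 0."""
    v = np.linalg.inv(np.eye(len(R)) - gamma * P_opt) @ R  # value under expert
    return bool(np.all((P_opt - P_alt) @ v >= 0))

print(consistent(P_a1, P_a2, R, gamma))                    # True for this R
print(consistent(P_a1, P_a2, np.array([0.0, 1.0]), gamma)) # False: reversed reward
```

Since this condition is satisfied by many rewards (including R = 0), Ng and Russell select among the feasible set with a linear program; the paper's contribution is reworking this characterization for partially observable environments, where the expert acts on beliefs rather than states.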


Keywords: POMDPs; Reinforcement Learning; Sequential Decision Making
