Jilles Steeve Dibangoye, Brahim Chaib-draa, Abdel-illah Mouaddib
We address the problem of computing an optimal value function for Markov decision processes. Since finding this function quickly and accurately requires substantial computation effort, techniques that accelerate fundamental algorithms have been a main focus of research. Among them prioritization solvers suggest solutions to the problem of ordering backup operations. Prioritization techniques for ordering the sequence of backup operations reduce the number of needed backups considerably, but involve significant overhead. This paper provides a new way to order backups, based on a mapping of states space into a metric space. Empirical evaluation verifies that our method achieves the best balance between the number of backups executed and the effort required to prioritized backups, showing order of magnitude improvement in runtime over number of benchmarks.
Subjects: 1.11 Planning; 12.1 Reinforcement Learning
Submitted: Feb 18, 2008