AAAI Publications, Twenty-Seventh International Conference on Automated Planning and Scheduling

Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes
Shun Zhang, Edmund Durfee, Satinder Singh

Last modified: 2017-06-05


When planning actions to take on behalf of its human operator, a robot might be uncertain about its operator's reward function. We address the problem of how the robot should formulate an (approximately) optimal query to pose to the operator, given how its uncertainty affects which policies it should plan to pursue. We explain how a robot whose queries ask the operator to choose the best from among k choices can, without loss of optimality, restrict consideration to choices only over alternative policies. Further, we present a method for constructing an approximately-optimal policy query that enjoys a performance bound, where the method need not enumerate all policies. Finally, because queries posed to the operator of a robotic system are often expressed in terms of preferences over trajectories rather than policies, we show how our constructed policy query can be projected into the space of trajectory queries. Our empirical results demonstrate that our projection technique can outperform prior techniques for choosing trajectory queries, particularly when the number of trajectories the operator is asked to compare is small.
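To make the idea of a k-choice policy query concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a small, explicitly enumerated set of candidate policies and a finite set of candidate reward functions with a known prior, whereas the abstract states that the paper's method need not enumerate all policies. It also assumes one simple objective: the expected value obtained if the operator picks, from the query, the policy that is best under the true reward. All names (`query_value`, `greedy_policy_query`, `values`, `prior`) and the numbers are hypothetical.

```python
from typing import List, Sequence


def query_value(values: Sequence[Sequence[float]],
                prior: Sequence[float],
                query: Sequence[int]) -> float:
    """Expected value of a policy query (assumed objective).

    values[i][r] is the value of candidate policy i under candidate reward r.
    prior[r] is the probability that reward r is the operator's true reward.
    The operator is assumed to pick the policy in `query` that is best
    under the true reward.
    """
    return sum(p * max(values[i][r] for i in query)
               for r, p in enumerate(prior))


def greedy_policy_query(values: Sequence[Sequence[float]],
                        prior: Sequence[float],
                        k: int) -> List[int]:
    """Build a query of up to k policies by greedy forward selection."""
    query: List[int] = []
    for _ in range(min(k, len(values))):
        remaining = [i for i in range(len(values)) if i not in query]
        # Add the policy that most increases the value of the query.
        best = max(remaining,
                   key=lambda i: query_value(values, prior, query + [i]))
        query.append(best)
    return query


# Hypothetical example: four candidate policies, three candidate rewards.
values = [
    [1.0, 0.0, 0.0],  # policy 0: good only under reward 0
    [0.0, 1.0, 0.0],  # policy 1: good only under reward 1
    [0.0, 0.0, 1.0],  # policy 2: good only under reward 2
    [0.4, 0.4, 0.4],  # policy 3: mediocre under every reward
]
prior = [0.5, 0.3, 0.2]

query = greedy_policy_query(values, prior, k=2)
print(query, query_value(values, prior, query))
```

In this toy instance the greedy procedure first selects the policy that is best under the most likely reward, then adds the policy that covers the next most likely one. Exhaustive enumeration of candidate policies is used here only for illustration.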
