AAAI Publications, The Twenty-Sixth International FLAIRS Conference

Exploiting Key Events for Learning Interception Policies
Yuan Chang, Gita Reese Sukthankar

Last modified: 2013-05-19

Abstract


One scenario that commonly arises in computer games and military training simulations is predator-prey pursuit, in which the goal of the non-player character agent is to intercept a fleeing player. In this paper, we focus on a variant of the problem in which the agent does not have perfect information about the player’s location but has prior experience in combating the player. Effectively addressing this problem requires combining learning the opponent’s tactics with planning an interception strategy. Although solving the problem with standard POMDP (Partially Observable Markov Decision Process) solvers is feasible for small maps, increasing the search area renders many standard techniques intractable because both the belief state size and the required plan length grow. Here we introduce a new approach for solving the problem on large maps that exploits key events, high-reward regions in the belief state discovered at a higher level of abstraction, to plan efficiently over the low-level map. We demonstrate that our hierarchical key-events planner can learn intercept policies from traces of previous pursuits significantly faster than a standard point-based POMDP solver, particularly as the maps scale in size.
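
To make the notion of a belief state over the player's location concrete, the following is a minimal sketch of a standard discrete Bayes filter on a grid map. It is not the hierarchical key-events planner described in the abstract. All function names, the uniformly random motion model, and the Manhattan-distance sensor are assumptions made for illustration; in the setting of the paper, a motion model learned from traces of previous pursuits would presumably take the place of the placeholder used here.

```python
from itertools import product


def uniform_belief(width, height):
    """Start with no knowledge: every cell is equally likely to hold the player."""
    cells = list(product(range(width), range(height)))
    return {cell: 1.0 / len(cells) for cell in cells}


def neighbors(cell, width, height):
    """Cells reachable in one step: stay put or move in one of 4 directions."""
    x, y = cell
    candidates = [(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < width and 0 <= cy < height]


def predict(belief, width, height):
    """Prediction step with a placeholder motion model (assumption):
    the player picks uniformly among the reachable cells."""
    new_belief = {cell: 0.0 for cell in belief}
    for cell, p in belief.items():
        options = neighbors(cell, width, height)
        for nxt in options:
            new_belief[nxt] += p / len(options)
    return new_belief


def update_not_seen(belief, agent, radius):
    """Correction step for a negative observation (assumed sensor model):
    the agent would see the player within Manhattan distance `radius`,
    so those cells are ruled out and the rest is renormalized."""
    posterior = {}
    for cell, p in belief.items():
        dist = abs(cell[0] - agent[0]) + abs(cell[1] - agent[1])
        posterior[cell] = 0.0 if dist <= radius else p
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation is inconsistent with the current belief")
    return {cell: p / total for cell, p in posterior.items()}


# One filter step on a 4x4 map with the agent at the corner (0, 0).
belief = uniform_belief(4, 4)
belief = predict(belief, 4, 4)
belief = update_not_seen(belief, agent=(0, 0), radius=1)
```

The belief in this sketch has one entry per map cell, so a 10x10 map already needs a 100-dimensional belief vector. This is the growth in belief state size that the abstract cites as a reason standard solvers become intractable on larger maps.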
