Carlos Guestrin, Shobha Venkataraman, and Daphne Koller, Stanford University
We present an algorithm for coordinated decision making in cooperative multiagent settings, where the agents’ value function can be represented as a sum of context-specific value rules. The task of finding an optimal joint action in this setting leads to an algorithm where the coordination structure between agents depends on the current state of the system and even on the actual numerical values assigned to the value rules. We apply this framework to the task of multiagent planning in dynamic systems, showing how a joint value function of the associated Markov Decision Process can be approximated as a set of value rules using an efficient linear programming algorithm. The agents then apply the coordination graph algorithm at each iteration of the process to decide on the highest-value joint action, potentially leading to a different coordination pattern at each step of the plan.