On the Difficulty of Achieving Equilibrium in Interactive POMDPs

Prashant Doshi, Piotr Gmytrasiewicz

We analyze the asymptotic behavior of agents engaged in an infinite horizon partially observable stochastic game as formalized by the interactive POMDP framework. We show that when agents' initial beliefs satisfy a truth compatibility condition, their behavior converges to a subjective epsilon-equilibrium in a finite time, and subjective equilibrium in the limit. This result is a generalization of a similar result in repeated games, to partially observable stochastic games. However, it turns out that the equilibrating process is difficult to demonstrate computationally because of the difficulty in coming up with initial beliefs that are both natural and satisfy the truth compatibility condition. Our results, therefore, shed some negative light on using equilibria as a solution concept for decision making in partially observable stochastic games.

Subjects: 7.1 Multi-Agent Systems; 9.3 Mathematical Foundations


This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.