Dynamic Programming with Stochastic Opponent Models in Social Games: A Preliminary Report

Tsz-Chiu Au

Policy makers often confront with the following problem: how best their organization can repeatedly interact with other organizations such that the long-term utility of their organization can be maximized? This problem is difficult because policy makers usually know very little about other organizations, and therefore they cannot make perfect predictions about the other organizations’ behaviors. In this paper, we formulate this problem as social games in which (1) there are two or more agents interacting with each other; (2) each agent can perform more than one action in each interaction; and (3) the payoff matrix is not fixed; the payoff matrix varies from one situation to another. We devised a dynamic programming algorithm to compute a policy given the model of the other agent’s behavior, written in a language called SOMAprograms, a rich language for representing agent’s incomplete belief about the other agents’ behavior.

Subjects: 7.1 Multi-Agent Systems; 12.1 Reinforcement Learning

Submitted: Jun 20, 2008

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.