Junling Hu and Michael P. Wellman, University of Michigan, USA
Learning in a multiagent environment is complicated by the fact that as other agents learn, the environment effectively changes. Moreover, other agents’ actions are often not directly observable, and the actions taken by the learning agent can strongly bias the range of behaviors it encounters. We define the concept of a self-fulfilling equilibrium, in which all agents’ expectations are realized and each agent responds optimally to its expectations. We present a generic multiagent exchange situation in which the competitive equilibrium is self-fulfilling. We then introduce an agent that executes a more sophisticated strategic learning strategy, building a model of the responses of other agents. We find that the system reliably converges to a self-fulfilling equilibrium, but that, depending on its initial belief, the strategic learning agent may be better or worse off than had it not attempted to learn a model of the other agents at all.