Jacob W. Crandall and Michael A. Goodrich
The goal of an agent playing a repeated game is to maximize its payoffs over time. In repeated games between other learning agents, this often requires that an agent must learn to offer and accept profitable compromises. To do so, past research suggests that agents must implement both teaching and following strategies. However, few algorithms successfully employ both kinds of strategies simultaneously. In this paper, we present an algorithm (called SPaM) that employs both kinds of strategies simultaneously in 2-player matrix games when the complete game matrix is observable. We show (empirically) that SPaM learns quickly and effectively when associating with a large class of agents, including self, best response learners, and (perhaps) humans.