Itai Ashlagi, Dov Monderer, Moshe Tennenholtz
We consider a resource selection game with incomplete information about the resource-cost functions. All the players know is the set of players, an upper bound on the possible costs, and that the cost functions are positive and nondecreasing. The game is played repeatedly, and after every stage each player observes her own cost and the actions of all players. For every ε>0 we prove the existence of a learning ε-equilibrium: a profile of algorithms, one for each player, such that a unilateral deviation by a player is, up to ε, not beneficial for her, regardless of the actual cost functions. Furthermore, the learning equilibrium yields an optimal social cost.
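To make the setting concrete, the following is a minimal sketch of a single stage of a resource selection game, not the learning algorithms whose existence the paper proves. It assumes that a player's cost is her chosen resource's cost function evaluated at the number of players using that resource, and that social cost is the sum of all players' costs; the abstract states neither definition explicitly. The function names and the two-resource instance are hypothetical and chosen only for illustration.

```python
from itertools import product

def player_costs(profile, cost_fns):
    """Return each player's cost for one stage.

    profile[i] is the index of the resource chosen by player i.
    cost_fns[r](k) is the (assumed) cost of resource r when k players use it.
    """
    loads = {}
    for r in profile:
        loads[r] = loads.get(r, 0) + 1
    return [cost_fns[r](loads[r]) for r in profile]

def social_cost(profile, cost_fns):
    """Assumed definition: the sum of all players' costs."""
    return sum(player_costs(profile, cost_fns))

def optimal_profile(n_players, cost_fns):
    """Brute-force search for a profile with minimal social cost.

    Only feasible for tiny instances; it requires knowing the cost
    functions, which the players in the paper's setting do not.
    """
    resources = range(len(cost_fns))
    return min(product(resources, repeat=n_players),
               key=lambda p: social_cost(p, cost_fns))

# Hypothetical instance: both cost functions are positive and nondecreasing.
cost_fns = [
    lambda k: k,  # resource 0: cost grows with the number of users
    lambda k: 3,  # resource 1: constant cost
]

best = optimal_profile(3, cost_fns)
print(best, social_cost(best, cost_fns))
```

In this instance, crowding all three players onto one resource costs 9 in total, while splitting them across both resources costs 7. The brute-force search needs the true cost functions, which is exactly the information that players in the incomplete-information game lack and must compensate for through repeated play.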
Subjects: 7.1 Multi-Agent Systems; 12.1 Reinforcement Learning
Submitted: Apr 22, 2007