Sven Seuken, Ruggiero Cavallo, David C. Parkes
Consider a multi-agent system in which agents have local actions, rewards, and states, all private, together with a central coordinator (the "center") that has its own actions, rewards, and states. Agents can never communicate with each other and are periodically inaccessible to the center. When accessible to the center, agents can report their local state (and models) and receive recommendations from the center about the local policies to follow for the present period and, should they become inaccessible, until they become accessible again. The actions of the center also affect the rewards and states of agents while they are accessible. This provides a rich new problem class for decentralized Markov decision processes (DEC-MDPs): the partially-synchronized DEC-MDPs. We also allow for self-interested agents and bridge to methods of dynamic mechanism design, aligning incentives so that agents report their local state when accessible and choose to follow the policy prescribed by the center.
Subjects: 7.1 Multi-Agent Systems; 3.4 Probabilistic Reasoning
Submitted: Apr 16, 2008