Behavioral Evaluation of Hanabi Rainbow DQN Agents and Rule-Based Agents

Authors

  • Rodrigo Canaan, New York University
  • Xianbo Gao, New York University
  • Youjin Chung, New York University
  • Julian Togelius, New York University
  • Andy Nealen, University of Southern California
  • Stefan Menzel, Honda Research Institute Europe

DOI:

https://doi.org/10.1609/aiide.v16i1.7404

Abstract

Hanabi is a multiplayer cooperative card game, where only your partners know your cards. All players succeed or fail together. This makes the game an excellent testbed for studying collaboration. Recently, it has been shown that deep neural networks can be trained through self-play to play the game very well. However, such agents generally do not play well with others. In this paper, we investigate the consequences of training Rainbow DQN agents with human-inspired rule-based agents. We analyze with which agents Rainbow agents learn to play well, and how well playing skill transfers to agents they were not trained with. We also analyze patterns of communication between agents to elucidate how collaboration happens. A key finding is that while most agents only learn to play well with partners seen during training, one particular agent leads the Rainbow algorithm towards a much more general policy. The metrics and hypotheses advanced in this paper can be used for further study of collaborative agents.

Published

2020-10-01

How to Cite

Canaan, R., Gao, X., Chung, Y., Togelius, J., Nealen, A., & Menzel, S. (2020). Behavioral Evaluation of Hanabi Rainbow DQN Agents and Rule-Based Agents. Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 16(1), 31-37. https://doi.org/10.1609/aiide.v16i1.7404