AAAI Publications, Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence

The Baseline Approach to Agent Evaluation
Josh Davidson, Christopher Archibald, Michael Bowling

Last modified: 2013-06-29


An important aspect of agent evaluation in stochastic games, especially poker, is the need to reduce outcome variance in order to obtain accurate and statistically significant results. The method currently used in the Annual Computer Poker Competition’s analysis is duplicate poker, an approach that leverages the ability to deal sets of cards to agents in order to reduce variance. This work explores a different approach to variance reduction: a control-variate-based technique known as baseline. The baseline approach uses an agent’s outcome in self-play to create an unbiased estimator for use in agent evaluation, and it has been shown to work well in both poker and trading agent competition domains. Baseline does not require that agents can be dealt sets of cards, making it a more robust technique than duplicate. The approach is compared to the current duplicate method, as well as to other variations of duplicate poker, on the results of the 2011 two-player no-limit and three-player limit Texas Hold’em ACPC tournaments.
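
The control-variate idea behind baseline can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed coefficient c = 1, the assumption that the baseline score has a known expected value of zero, and the synthetic "luck" model are all assumptions made here for illustration. Each raw outcome is corrected by a correlated baseline score whose mean is known, which leaves the expected value unchanged for any fixed c while removing the variance the two scores share.

```python
import random
import statistics


def baseline_adjusted(outcomes, baseline_scores, baseline_mean=0.0, c=1.0):
    """Generic control-variate correction (illustrative, hypothetical helper).

    Each outcome x is replaced by x - c * (b - baseline_mean), where b is a
    correlated baseline score with known expectation baseline_mean.
    For a fixed c the adjusted mean has the same expectation as the raw mean.
    Returns (estimate, adjusted_values).
    """
    adjusted = [x - c * (b - baseline_mean)
                for x, b in zip(outcomes, baseline_scores)]
    return statistics.fmean(adjusted), adjusted


def simulate(n, seed=0, skill=0.5):
    """Synthetic data (assumed model, not poker): shared luck dominates both scores."""
    rng = random.Random(seed)
    raw, base = [], []
    for _ in range(n):
        luck = rng.gauss(0.0, 10.0)                  # card luck shared by both scores
        raw.append(skill + luck + rng.gauss(0.0, 1.0))  # evaluated agent's outcome
        base.append(luck + rng.gauss(0.0, 1.0))         # baseline score, mean zero
    return raw, base


if __name__ == "__main__":
    raw, base = simulate(2000)
    estimate, adjusted = baseline_adjusted(raw, base)
    print("raw mean:", statistics.fmean(raw),
          "variance:", statistics.variance(raw))
    print("adjusted mean:", estimate,
          "variance:", statistics.variance(adjusted))
```

In this toy model the adjusted values keep only the noise that is not shared with the baseline score, so their sample variance is far smaller than that of the raw outcomes. How the baseline score is actually obtained from self-play is described in the paper itself and is not reproduced here.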

