Combining Entropy Based Heuristics with Minimax Search and Temporal Differences to Play Hidden State Games

Gregory J. Calbert and Hing-Wah Kwok

In this paper, we develop a method for playing variants of spatial games like chess or checkers, where the state of the opponent is only partially observable. Each side has a number of hidden pieces that are invisible to the opposition. An estimate of the opponent's state probability distribution is made by assuming that moves are chosen to maximize the entropy of the subsequent state distribution, or belief. The belief state of the game at any time is specified by a probability distribution over the opponent’s states and, conditional on each of these states, a distribution over our states, which is our estimate of the opponent’s belief about our state. With this, we can calculate the relative uncertainty, or entropy balance, between the two sides. We use this information balance, along with other observable features and belief-based minimax search, to approximate the partially observable Q-function. Gradient descent is used to learn the advisor weights.
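
The entropy balance described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the definition used here (the expected Shannon entropy of the opponent's belief about our state, minus the entropy of our belief about the opponent's state) is an assumption, and the function names `entropy` and `entropy_balance` are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy, in bits, of a mapping from state to probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)


def entropy_balance(belief_opponent, opponent_belief_of_us):
    """Assumed definition: their expected uncertainty minus our uncertainty.

    belief_opponent: our distribution over the opponent's states.
    opponent_belief_of_us: for each opponent state, the estimated
        distribution the opponent holds over our states.
    A positive value means the opponent is, on average, more uncertain
    about us than we are about the opponent.
    """
    our_uncertainty = entropy(belief_opponent)
    their_uncertainty = sum(
        p * entropy(opponent_belief_of_us[state])
        for state, p in belief_opponent.items()
    )
    return their_uncertainty - our_uncertainty


# Toy example with two candidate opponent states (labels are illustrative).
belief_opponent = {"a": 0.5, "b": 0.5}
opponent_belief_of_us = {
    "a": {"w": 0.25, "x": 0.25, "y": 0.25, "z": 0.25},
    "b": {"w": 0.5, "x": 0.5},
}
print(entropy_balance(belief_opponent, opponent_belief_of_us))  # 0.5
```

In this toy case we hold 1 bit of uncertainty about the opponent, while the opponent is expected to hold 1.5 bits about us, giving a balance of 0.5 bits in our favour. A value like this could serve as one feature among those combined by the learned advisor weights.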


This page is copyrighted by AAAI. All rights reserved.