Timothy Huang, Graeme Connell, and Bryan McQuade
In this paper, we present an experimental methodology and results for a machine learning approach to learning opening strategy in the game of Go, a game in which the best computer programs play only at the level of an advanced human beginner. While the evaluation function in most computer Go programs consists of a carefully crafted combination of pattern matchers, expert rules, and selective search, we employ a neural network trained by self-play using temporal difference learning. Our focus is the sequence of moves made at the beginning of the game. Experimental results indicate that our approach is effective for learning opening strategy, and they also identify higher-level features of the game that improve the quality of the learned evaluation function.
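The core of the training method mentioned above is the temporal difference update, which nudges the value of each position toward the value of the position that follows it. As a minimal sketch (not the paper's actual network), the rule V(s) ← V(s) + α·(r + V(s′) − V(s)) can be illustrated on a hypothetical two-state chain with a tabular value function, where the learning rate and episode structure are assumptions chosen for clarity:

```python
# Toy TD(0) illustration: two non-terminal states, then a terminal
# state with final reward 1 (standing in for a won game).
alpha = 0.5          # learning rate (assumed for illustration)
V = [0.0, 0.0]       # value estimates for the two non-terminal states

for _ in range(50):
    # One "game": state 0 -> state 1 -> terminal (reward 1).
    V[0] += alpha * (0.0 + V[1] - V[0])   # no intermediate reward
    V[1] += alpha * (1.0 - V[1])          # successor is terminal; target is the reward

print(V)  # both estimates approach 1.0, the certain final outcome
```

In the paper's setting, the tabular values are replaced by a neural network evaluating Go positions, and the "episodes" are games the network plays against itself.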