François Rivest and Doina Precup
Using neural networks to represent value functions in reinforcement learning algorithms often requires considerable effort in hand-crafting the network structure and tuning the learning parameters. In this paper, we explore the potential of using constructive neural networks in reinforcement learning. Constructive neural network methods are appealing because they build the network structure based on the data that needs to be represented. To our knowledge, such algorithms have not previously been used in reinforcement learning. A major issue is that constructive algorithms often work in batch mode, while many reinforcement learning algorithms work on-line. We address this by using a cache to accumulate data on-line and then applying a variant of cascade correlation to the cached data to update the value function. Preliminary results on the game of Tic-Tac-Toe show the potential of this new algorithm relative to static feed-forward neural networks trained with backpropagation.
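
To make the cache-then-batch-update idea concrete, the following is a minimal Python sketch written under stated assumptions; it is not the algorithm evaluated in the paper. The abstract does not specify the cache size, the form of the training targets, or the details of the cascade-correlation variant, so the names `CachedValueLearner`, `average_targets`, `capacity`, and `gamma` are hypothetical, the targets are assumed to be one-step temporal-difference targets, and the batch learner is a simple table-based stand-in where a constructive network trainer would go.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Sample = Tuple[Hashable, float]  # (state, training target)


def average_targets(values: Dict[Hashable, float], batch: List[Sample]) -> None:
    """Stand-in batch learner (assumption): set each state's value to the
    mean of its cached targets. A cascade-correlation trainer would go here."""
    sums: Dict[Hashable, Tuple[float, int]] = {}
    for state, target in batch:
        total, count = sums.get(state, (0.0, 0))
        sums[state] = (total + target, count + 1)
    for state, (total, count) in sums.items():
        values[state] = total / count


class CachedValueLearner:
    """Collects on-line samples in a cache and hands them to a batch learner."""

    def __init__(
        self,
        batch_fit: Callable[[Dict[Hashable, float], List[Sample]], None],
        capacity: int = 100,   # hypothetical cache size
        gamma: float = 1.0,    # hypothetical discount factor
    ) -> None:
        self.batch_fit = batch_fit
        self.capacity = capacity
        self.gamma = gamma
        self.cache: List[Sample] = []
        self.values: Dict[Hashable, float] = {}

    def value(self, state: Hashable) -> float:
        # Unseen states default to 0.0 in this sketch.
        return self.values.get(state, 0.0)

    def observe(
        self,
        state: Hashable,
        reward: float,
        next_state: Optional[Hashable],
        terminal: bool,
    ) -> None:
        # Assumed target: one-step TD target from the current value estimate.
        if terminal:
            target = reward
        else:
            target = reward + self.gamma * self.value(next_state)
        self.cache.append((state, target))
        if len(self.cache) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        # Batch update on everything accumulated so far, then empty the cache.
        if self.cache:
            self.batch_fit(self.values, self.cache)
            self.cache = []
```

In this sketch the value estimates stay fixed while samples accumulate and change only when the cache fills (or `flush` is called), which is what lets a batch-mode learner sit behind an on-line interaction loop.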