Bohdana Ratitch, Swaminathan Mahadevan, and Doina Precup
In this paper, we advocate the use of Sparse Distributed Memories (SDMs) for on-line, value-based reinforcement learning (RL). The SDM model was originally designed for settings where a very large input (address) space must be mapped into a much smaller physical memory. SDMs provide a linear, local function approximation scheme, which is often preferred in RL. In our recent work, we developed an algorithm for simultaneously learning the structure and the content of the memory on-line. In this paper, we investigate the empirical performance of the Sarsa algorithm with the SDM function approximator on three domains: the traditional Mountain-Car task, a variant of a hunter-prey task, and a motor-control problem called Swimmer. The second and third tasks are high-dimensional and exhibit complex dynamics, yet our approach provides good solutions.
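To make the "linear, local function approximation" idea concrete, the following is a minimal sketch of an SDM-style value approximator: a fixed set of "hard locations" (prototypes) in state space, each holding a stored value; a query activates the nearby locations, the prediction is a linear combination of their values, and learning distributes the prediction error over the active locations. The class name, activation rule, and parameters here are illustrative assumptions, not the exact algorithm from the paper (which also adapts the memory structure on-line).

```python
class SDM:
    """Illustrative Sparse Distributed Memory as a linear, local
    function approximator (a sketch, not the paper's exact method)."""

    def __init__(self, centers, radius):
        self.centers = centers              # "hard locations" in state space
        self.radius = radius                # per-dimension activation radius
        self.values = [0.0] * len(centers)  # value stored at each location

    def _active(self, state):
        # A location is active if the state lies within its radius in
        # every dimension (a simple local similarity test; an assumption).
        return [i for i, c in enumerate(self.centers)
                if all(abs(s - ci) <= self.radius
                       for s, ci in zip(state, c))]

    def predict(self, state):
        # Linear combination (here, the average) of active locations' values.
        idx = self._active(state)
        if not idx:
            return 0.0
        return sum(self.values[i] for i in idx) / len(idx)

    def update(self, state, target, lr=0.1):
        # Distribute the prediction error equally over active locations;
        # with a TD target this becomes the Sarsa-style update.
        idx = self._active(state)
        if not idx:
            return
        err = target - self.predict(state)
        for i in idx:
            self.values[i] += lr * err / len(idx)
```

Because only the locations near the query are touched, each prediction and update is cheap and local, which is what makes the scheme attractive for on-line RL.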