John W. Sheppard and Steven L. Salzberg
A number of special-purpose learning techniques have been developed in recent years to address the problem of learning with delayed reinforcement. This category includes numerous important control problems that arise in robotics, planning, and other areas. However, very few researchers have attempted to apply memory-based techniques to these tasks. We explore the performance of a common memory-based technique, nearest neighbor learning, on a non-trivial delayed reinforcement task. The task requires the machine to take the role of an airplane that must learn to evade pursuing missiles. The goal of learning is to find a relatively small number of exemplars that can be used to perform the task well. Because a prior study showed that nearest neighbor had great difficulty performing this task, we decided to use genetic algorithms as a bootstrapping method to provide the examples. We then edited the examples further to reduce the size of memory. Our new experiments demonstrate that the bootstrapping method resulted in a dramatic improvement in the performance of the memory-based approach, in terms of both overall accuracy and the size of memory.
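As a rough illustration of the memory-based idea, the following minimal Python sketch stores exemplars as (state, action) pairs, selects the action of the nearest stored state, and then edits the memory by dropping exemplars that the remaining memory already predicts correctly. Everything here is an assumption made for illustration: the two-dimensional states, the action labels, the Euclidean distance, and the greedy editing rule are hypothetical and are not taken from the experiments described above.

```python
import math


def nearest_action(state, exemplars):
    """Return the action stored with the exemplar closest to `state`."""
    # Euclidean distance is an assumed metric for this sketch.
    _, action = min(exemplars, key=lambda ex: math.dist(ex[0], state))
    return action


def edit_exemplars(exemplars):
    """Greedily drop exemplars that the rest of memory already predicts.

    This is a generic editing heuristic chosen for illustration only.
    """
    kept = list(exemplars)
    i = 0
    while i < len(kept) and len(kept) > 1:
        state, action = kept[i]
        rest = kept[:i] + kept[i + 1:]
        if nearest_action(state, rest) == action:
            kept = rest  # redundant: the remaining memory gives the same action
        else:
            i += 1  # needed: removing it would change the chosen action
    return kept


# Hypothetical toy memory: 2-D states paired with made-up evasive actions.
memory = [
    ((0.0, 0.0), "left"),
    ((0.1, 0.0), "left"),
    ((1.0, 1.0), "right"),
    ((0.9, 1.0), "right"),
]

small = edit_exemplars(memory)
print(len(memory), "->", len(small))
print(nearest_action((0.2, 0.1), small))
```

In this toy setting the edited memory is smaller than the original while still returning the same action for nearby query states, which is the trade-off between accuracy and memory size that the abstract refers to.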