Robert A. Hearn, Richard H. Granger
Learning to perform via reinforcement typically requires extensive search through an intractably large space of possible behaviors. In the brain, reinforcement learning is hypothesized to be carried out in large measure by the basal ganglia/striatal complex, a phylogenetically old set of structures that dominate the brains of reptiles. In humans, the striatal complex is integrated into a tight loop with cortex and thalamus; the resulting cortico-striatal loops account for the vast majority of the human forebrain. Studies of these systems have led to the hypotheses that the cortex learns to construct large hierarchical representations of perceptions and actions, and that these representations substantially constrain and direct a search that the striatal complex would otherwise pursue blindly (as, perhaps, in reptiles). This notion has led to the construction of a modular system in which loops of thalamocortical and striatal models interact: hierarchical representation learning in the former strongly constrains the trial-and-error reinforcement learning of the latter, while, reciprocally, the latter can be thought of as testing hypotheses generated by the former. We report on explorations of these models in the context of learning complex behaviors by example, in simulated environments and on real robots.
Submitted: Sep 12, 2008