Rajarshi Das, Sreerupa Das
Moments after a baseball batter has hit a fly ball, an outfielder has to decide whether to run forward or backward to catch the ball. Judging a fly ball is a difficult task, especially when the fielder is in the plane of the ball's trajectory. Several alternative hypotheses in the literature identify different perceptual features available to the fielder that may provide useful cues as to the location of the ball's landing point. A recent study in experimental psychology suggests that, to intercept the ball, the fielder has to run such that the second time derivative of tan φ is close to zero, where φ is the elevation angle of the ball from the fielder's perspective (McLeod and Dienes 1993). We investigate whether d²(tan φ)/dt² information is a useful cue for learning this task in the Adaptive Heuristic Critic (AHC) reinforcement learning framework. Our results provide supporting evidence that d²(tan φ)/dt² information furnishes a strong initial cue in determining the landing point of the ball and plays a key role in the learning process. However, our simulations show that during later stages of the ball's flight, yet another perceptual feature, the perpendicular velocity of the ball (v_p) with respect to the fielder, provides stronger cues as to the location of the landing point. The trained network generalized to novel circumstances and also exhibited some of the behaviors recorded by experimental psychologists in human data. We believe that much can be gained by using reinforcement learning approaches to learn common physical tasks, and that similarly motivated work could stimulate useful interdisciplinary research on the subject.
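To make the elevation-angle cue concrete, the following minimal sketch checks it numerically for an idealized case. It assumes a drag-free parabolic trajectory, a stationary fielder in the plane of the ball's flight, and arbitrary launch velocities; the function names and all numeric values are illustrative assumptions and are not taken from the simulations reported here. Under these assumptions, the tangent of the elevation angle grows linearly in time for a fielder standing at the landing point, so its second time derivative vanishes, while it is nonzero for a fielder standing elsewhere.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed, drag-free model)

def tan_elevation(t, vx, vy, fielder_x):
    """tan(phi) seen by a stationary fielder at fielder_x.

    The ball is launched from the origin with velocity (vx, vy);
    air resistance is ignored (an assumption of this sketch).
    """
    ball_x = vx * t
    ball_y = vy * t - 0.5 * G * t * t
    return ball_y / (fielder_x - ball_x)

def second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of d2f/dt2 at time t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

# Hypothetical launch velocities, chosen only for illustration.
vx, vy = 20.0, 25.0
flight_time = 2.0 * vy / G
landing_x = vx * flight_time

def cue(fielder_x, t=1.0):
    """d2(tan phi)/dt2 at time t for a fielder standing at fielder_x."""
    return second_derivative(lambda s: tan_elevation(s, vx, vy, fielder_x), t)

at_landing = cue(landing_x)            # approximately zero
too_deep = cue(landing_x + 10.0)       # negative: ball will drop in front
too_shallow = cue(landing_x - 10.0)    # positive: ball will carry overhead
```

In this idealized setting the sign of the cue indicates the direction in which the fielder should move, which is consistent with its usefulness early in the flight; the sketch does not model air resistance, a moving fielder, or the AHC learning procedure.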