Daniel S. Bernstein, Shlomo Zilberstein, Richard Washington, and John L. Bresina
Planetary rovers must be effective in gathering scientific data despite uncertainty and limited resources. One step toward achieving this goal is to construct a high-level mathematical model of the problem faced by the rover and to use the model to develop a rover controller. We use the Markov decision process framework to model the rover control problem, and Monte Carlo reinforcement learning techniques to obtain a policy from the model. The learned policy is compared to a class of heuristic policies and is found to perform better in simulation than any policy within that class. These preliminary results demonstrate the potential of combining the Markov decision process framework with reinforcement learning techniques to develop rover controllers.
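To make the approach concrete, the following is a minimal sketch of Monte Carlo control on a toy finite-horizon Markov decision process. It is not the rover model or the learning algorithm evaluated in this work. The state (number of decisions remaining), the action names `safe` and `risky`, the reward values, and all parameter settings are hypothetical and chosen only for illustration. The sketch estimates action values by averaging sampled returns under an epsilon-greedy policy, then reads off the greedy policy.

```python
import random

# Hypothetical toy model: the state is the number of decisions remaining.
ACTIONS = ("safe", "risky")
HORIZON = 3


def sample_reward(action, rng):
    """Sample the reward of one action in the toy model (assumed values)."""
    if action == "safe":
        return 1.0  # deterministic, modest payoff
    return 3.0 if rng.random() < 0.1 else 0.0  # rare, larger payoff


def greedy_action(q, state):
    """Return the action with the highest estimated value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def monte_carlo_control(episodes=5000, epsilon=0.1, seed=0):
    """Estimate action values from sampled episodes; return (policy, q)."""
    rng = random.Random(seed)
    states = range(1, HORIZON + 1)
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    counts = {key: 0 for key in q}

    for _ in range(episodes):
        # Generate one episode with an epsilon-greedy policy.
        trajectory = []
        for s in range(HORIZON, 0, -1):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = greedy_action(q, s)
            trajectory.append((s, a, sample_reward(a, rng)))

        # Walk backward, accumulating the undiscounted return, and move
        # each visited estimate toward the running sample average.
        ret = 0.0
        for s, a, r in reversed(trajectory):
            ret += r
            counts[(s, a)] += 1
            q[(s, a)] += (ret - q[(s, a)]) / counts[(s, a)]

    policy = {s: greedy_action(q, s) for s in states}
    return policy, q


if __name__ == "__main__":
    policy, q = monte_carlo_control()
    print(policy)
```

In this toy model the `risky` action has an expected reward of 0.3 per step against 1.0 for `safe`, so with enough episodes the greedy policy is expected to select `safe` in every state. A heuristic baseline could be evaluated in the same simulator by fixing the policy and averaging its sampled returns.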