Michail G. Lagoudakis, Duke University; Michael L. Littman, AT&T Labs -- Research and Duke University
Many computational problems can be solved by multiple algorithms, with different algorithms fastest for different problem sizes, input distributions, and hardware characteristics. We consider the problem of algorithm selection: dynamically choose an algorithm to attack an instance of a problem with the goal of minimizing the overall execution time. We formulate the problem as a kind of Markov decision process (MDP), and use ideas from reinforcement learning to solve it. The well known Q-learning algorithm is adapted for this case in a way that combines both Monte-Carlo and Temporal Difference methods. Our initial experiments focus on the problem of order statistic selection. The encouraging results reveal the potential of applying learning methods to traditional computational problems.