Russell Greiner, Igor Jurisica
Many "learning from experience" systems use information extracted from problem-solving experiences to modify a performance element PE, forming a new element PE' that can solve these and similar problems more efficiently. However, because transformations that improve performance on one set of problems can degrade performance on other sets, the new PE' is not always better than the original PE; this depends on the distribution of problems. We therefore seek the performance element whose expected performance, over this distribution, is optimal. Unfortunately, the actual distribution, which is needed to determine which element is optimal, is usually not known. Moreover, the task of finding the optimal element, even knowing the distribution, is intractable for most interesting spaces of elements. This paper presents a method, PALO, that sidesteps these problems by using a set of samples to estimate the unknown distribution, and by using a set of transformations to hill-climb to a local optimum. This process is based on a mathematically rigorous form of utility analysis: in particular, it uses statistical techniques to determine whether the result of a proposed transformation will be better than the original system. We also present an efficient way of implementing this learning system in the context of a general class of performance elements, and include empirical evidence that this approach can work effectively.
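The general idea of sample-based hill-climbing with a statistical acceptance test can be illustrated with a small sketch. This is not the PALO implementation. It assumes a Hoeffding-style bound as the statistical test, and every name in it (`hoeffding_threshold`, `hill_climb`, `swap_neighbors`, `tries_until_success`, `demo`) is hypothetical, as is the toy performance element, which is simply an ordering of two strategies whose cost is the number of strategies tried before one succeeds. As a simplification, the sketch divides the error budget across candidate transformations only, and does not correct for testing repeatedly as samples accumulate or across successive climbing steps.

```python
import math
import random


def hoeffding_threshold(n, value_range, delta):
    """Bound on a sum of n bounded differences (assumed Hoeffding-style test).

    If each difference spans at most `value_range` and the true mean
    difference is <= 0, the sum exceeds this value with probability <= delta.
    """
    return value_range * math.sqrt(0.5 * n * math.log(1.0 / delta))


def hill_climb(initial, neighbors, cost, sample_problem, value_range,
               delta=0.05, max_samples=2000):
    """Climb to a neighbor only when sampled evidence says it is cheaper."""
    current = initial
    improved = True
    while improved:
        improved = False
        candidates = neighbors(current)
        if not candidates:
            break
        # Running sum of (cost of current - cost of candidate) per candidate.
        totals = [0.0] * len(candidates)
        for n in range(1, max_samples + 1):
            problem = sample_problem()
            base = cost(current, problem)
            for i, cand in enumerate(candidates):
                totals[i] += base - cost(cand, problem)
            # Simplification: split delta across candidates only.
            threshold = hoeffding_threshold(n, value_range,
                                            delta / len(candidates))
            best = max(range(len(candidates)), key=totals.__getitem__)
            if totals[best] > threshold:
                current = candidates[best]
                improved = True
                break
    return current


def swap_neighbors(order):
    """Hypothetical transformation set: swap each adjacent pair of strategies."""
    result = []
    for i in range(len(order) - 1):
        swapped = list(order)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        result.append(tuple(swapped))
    return result


def tries_until_success(order, problem):
    """Toy cost: how many strategies are tried before the one that solves `problem`."""
    return order.index(problem) + 1


def demo(seed=0):
    """Toy distribution: strategy 'b' solves 90% of the sampled problems."""
    rng = random.Random(seed)

    def sample_problem():
        return "b" if rng.random() < 0.9 else "a"

    # Costs are 1 or 2, so cost differences lie in [-1, 1]: range 2.
    return hill_climb(("a", "b"), swap_neighbors, tries_until_success,
                      sample_problem, value_range=2.0)
```

Calling `demo()` starts from the ordering that tries strategy `a` first and should settle on the ordering that tries `b` first, since that ordering has the lower expected cost under the toy distribution. The climb stops when no transformation passes the test within `max_samples` samples, which yields a local rather than a global optimum.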