Robert B. Doorenbos
This paper examines several systems that learn a large number of rules (productions), including one that learns 113,938 rules, the largest number ever learned by an AI system and the largest number in any production system in existence. Matching these rules efficiently is essential to avoid the machine learning utility problem. Moreover, examination of such large systems reveals new phenomena and calls into question some common assumptions based on previous observations of smaller systems. We first show that the Rete and Treat match algorithms do not scale well with the number of rules in our systems, in part because the number of rules affected by a change to working memory increases with the total number of rules in these systems. We also show that the sharing of nodes in the beta part of the Rete network becomes increasingly important as the number of rules increases. Finally, we describe and evaluate a new optimization for Rete which improves its scalability and allows two of our systems to learn over 100,000 rules without significant performance degradation.
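To make the point about node sharing concrete, the following is a minimal sketch, not the implementation evaluated in this paper. It assumes a simplified model in which each rule is an ordered list of conditions and two rules can share a beta-level join node only when they agree on every condition up to that point. The function name `count_join_nodes`, the `Condition` tuple layout, and the three sample rules are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical condition pattern: (identifier, attribute, value).
Condition = Tuple[str, str, str]


def count_join_nodes(rules: List[List[Condition]], share: bool) -> int:
    """Count the join nodes needed for `rules` in a simplified beta network.

    Without sharing, every condition of every rule gets its own node.
    With sharing, rules whose condition lists begin with the same prefix
    reuse the nodes for that prefix (modeled here as a trie).
    """
    if not share:
        return sum(len(conds) for conds in rules)

    trie: Dict = {}
    nodes = 0
    for conds in rules:
        level = trie
        for cond in conds:
            if cond not in level:
                # First rule to reach this prefix: allocate a new node.
                level[cond] = {}
                nodes += 1
            level = level[cond]
    return nodes


# Three illustrative rules that overlap in their leading conditions.
rules = [
    [("<x>", "on", "<y>"), ("<y>", "color", "red")],
    [("<x>", "on", "<y>"), ("<y>", "color", "blue")],
    [("<x>", "on", "<y>"), ("<y>", "color", "red"), ("<x>", "size", "big")],
]

unshared = count_join_nodes(rules, share=False)
shared = count_join_nodes(rules, share=True)
print(f"join nodes without sharing: {unshared}, with sharing: {shared}")
```

In this toy model the three rules need seven nodes without sharing and four with it. The gap can only widen as more rules with common leading conditions are added, which is the intuition behind the growing importance of sharing; the model says nothing about actual match cost.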