Saher Esmeir, Shaul Markovitch
Most existing decision tree inducers are very fast due to their greedy approach. In many real-life applications, however, we are willing to allocate more time to get better decision trees. Our recently introduced LSID3 contract anytime algorithm allows computation speed to be traded for better tree quality. As a contract algorithm, LSID3 must be allocated its resources a priori, which is not always possible. In this work, we present IIDT, a general framework for interruptible induction of decision trees that does not require resources to be allocated a priori. The core of our proposed framework is an iterative improvement algorithm that repeatedly selects a subtree whose reconstruction is expected to yield the highest marginal utility. The algorithm then rebuilds the subtree with a higher allocation of resources. IIDT can also be configured to receive training examples as they become available, and is thus appropriate for incremental learning tasks. Empirical evaluation with several hard concepts shows that IIDT exhibits good anytime behavior and significantly outperforms greedy inducers when more time is available. A comparison of IIDT with several modern decision tree learners shows it to be superior.
Subjects: 12. Machine Learning and Discovery; 15.6 Decision Trees
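
The iterative improvement loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Subtree` class, the `estimated_utility` formula, the doubling of resources in `rebuild`, and the `step_budget` helper are all hypothetical stand-ins chosen only to make the control flow concrete. A subtree is reduced to its size and the resource level of its last rebuild, and a smaller size is assumed to mean a better tree.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subtree:
    """Toy stand-in for a subtree: its size and last resource level."""
    name: str
    size: int
    resources: int = 1


def estimated_utility(t: Subtree) -> float:
    """Hypothetical marginal utility: guessed benefit per unit of cost."""
    if t.size <= 1:
        return 0.0  # a single leaf cannot be improved
    benefit = (t.size - 1) / 2.0       # assumed possible size reduction
    cost = t.size * (2 * t.resources)  # assumed cost of the next rebuild
    return benefit / cost


def rebuild(t: Subtree) -> None:
    """Hypothetical rebuild with a higher allocation (resources doubled)."""
    t.resources *= 2
    candidate = max(1, t.size - max(1, t.size // 4))  # toy improvement
    t.size = min(t.size, candidate)  # keep the better of old and new


def step_budget(n: int) -> Callable[[], bool]:
    """Return an interruption predicate that allows n improvement steps."""
    remaining = [n]

    def should_stop() -> bool:
        if remaining[0] <= 0:
            return True
        remaining[0] -= 1
        return False

    return should_stop


def improve(subtrees: List[Subtree], should_stop: Callable[[], bool]) -> int:
    """Repeatedly rebuild the subtree with the highest estimated utility.

    The loop can be interrupted between steps; the current subtrees are
    always a valid result. Returns the number of rebuilds performed.
    """
    steps = 0
    while not should_stop():
        best = max(subtrees, key=estimated_utility)
        if estimated_utility(best) <= 0.0:
            break  # nothing left that is expected to improve
        rebuild(best)
        steps += 1
    return steps


if __name__ == "__main__":
    trees = [Subtree("a", 9), Subtree("b", 3)]
    print(improve(trees, step_budget(3)), trees)
```

In this sketch the interruption signal is a simple step counter; a wall-clock deadline or an external stop flag could be passed as `should_stop` instead, since the loop only requires a zero-argument predicate.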