Pruning Irrelevant Features from Oblivious Decision Trees

Pat Langley, Stephanie Sage

In this paper, we examine an approach to feature selection designed to handle domains that involve both irrelevant and interacting features. We review the reasons this situation poses challenges to both nearest-neighbor and decision-tree methods, then describe a new algorithm, OBLIVION, that carries out greedy pruning of oblivious decision trees. We summarize the results of experiments with artificial domains, which show that OBLIVION's sample complexity grows slowly with the number of irrelevant features, and with natural domains, which suggest that few existing data sets contain many irrelevant features. In closing, we consider other work on feature selection and outline directions for future research.
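
The abstract does not spell out the pruning procedure, so the following is only a minimal sketch of what greedy pruning of an oblivious decision tree might look like, not the authors' implementation. It assumes discrete features, represents the tree as a lookup table (every node at a given level tests the same feature, so a leaf is identified by an example's values on the retained features), and assumes that pruning removes one feature at a time as long as accuracy on a held-out set does not drop. The names `fit_oblivious`, `accuracy`, and `greedy_prune` are hypothetical, and the evaluation criterion is an assumption.

```python
from collections import Counter, defaultdict
from itertools import product


def fit_oblivious(examples, features):
    """Fit an oblivious tree over `features`, stored as a leaf table.

    Each leaf is keyed by the example's values on the retained features
    and predicts the majority class of the training examples it covers.
    """
    leaves = defaultdict(Counter)
    for x, y in examples:
        leaves[tuple(x[f] for f in features)][y] += 1
    # Fallback class for leaves never seen in training (an assumption).
    default = Counter(y for _, y in examples).most_common(1)[0][0]
    table = {key: counts.most_common(1)[0][0] for key, counts in leaves.items()}
    return table, default


def accuracy(examples, features, table, default):
    """Fraction of `examples` classified correctly by the leaf table."""
    hits = sum(
        table.get(tuple(x[f] for f in features), default) == y
        for x, y in examples
    )
    return hits / len(examples)


def greedy_prune(train, valid, n_features):
    """Backward elimination: drop one feature per step while the
    held-out accuracy does not decrease (ties favor the smaller tree)."""
    features = list(range(n_features))
    best = accuracy(valid, features, *fit_oblivious(train, features))
    while features:
        scored = []
        for f in features:
            candidate = [g for g in features if g != f]
            acc = accuracy(valid, candidate, *fit_oblivious(train, candidate))
            scored.append((acc, candidate))
        acc, candidate = max(scored, key=lambda s: s[0])
        if acc < best:
            break  # every removal hurts, so stop pruning
        best, features = acc, candidate
    return features, best


# Illustrative data: the class is the XOR of features 0 and 1 (interacting),
# while features 2 and 3 are irrelevant. For brevity the same examples serve
# as training and held-out sets; a real evaluation would keep them separate.
points = list(product([0, 1], repeat=4))
data = [(x, x[0] ^ x[1]) for x in points]
kept, acc = greedy_prune(data, data, n_features=4)
```

On this toy data the sketch should retain features 0 and 1 and discard the two irrelevant ones. Removing either XOR input alone makes accuracy fall, which illustrates why interacting features are hard for methods that judge each feature in isolation.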

