Pat Langley, Stephanie Sage
In this paper, we examine an approach to feature selection designed to handle domains that involve both irrelevant and interacting features. We review the reasons this situation poses challenges to both nearest neighbor and decision-tree methods, then describe a new algorithm, OBLIVION, that carries out greedy pruning of oblivious decision trees. We summarize the results of experiments with artificial domains, which show that OBLIVION's sample complexity grows slowly with the number of irrelevant features, and with natural domains, which suggest that few existing data sets contain many irrelevant features. In closing, we consider other work on feature selection and outline directions for future research.
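The abstract's central idea, greedy pruning of oblivious decision trees, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it exploits the fact that an oblivious tree over a feature subset is equivalent to a lookup table keyed by those features, and it scores candidate prunings by training-set accuracy rather than the cross-validation estimates a careful implementation would use. The function names `table_accuracy` and `oblivion` are ours.

```python
from collections import Counter, defaultdict

def table_accuracy(X, y, feats):
    # An oblivious decision tree over `feats` tests the same feature at
    # every node on a level, so it partitions examples exactly like a
    # lookup table keyed by those feature values; each leaf predicts the
    # majority class among its examples.
    leaves = defaultdict(Counter)
    for xi, yi in zip(X, y):
        leaves[tuple(xi[f] for f in feats)][yi] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in leaves.values())
    return correct / len(y)

def oblivion(X, y):
    # Greedy backward elimination: start with the full feature set and
    # repeatedly drop the feature whose removal least hurts accuracy,
    # stopping when every removal would make the tree worse.
    feats = list(range(len(X[0])))
    acc = table_accuracy(X, y, feats)
    while feats:
        scored = [(table_accuracy(X, y, [g for g in feats if g != f]), f)
                  for f in feats]
        best_acc, worst = max(scored)
        if best_acc < acc:
            break
        feats = [g for g in feats if g != worst]
        acc = best_acc
    return feats, acc
```

On a small XOR-style task (class = feature 0 XOR feature 1, with feature 2 irrelevant), the sketch prunes the irrelevant feature while retaining both interacting ones, the situation that defeats univariate feature ranking.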