Ke Wang and Suman Sundaresh
Feature selection is a data preprocessing step for classification and data mining tasks. Traditionally, feature selection chooses a minimum number of features that determine the class label, i.e., it aims at the horizontal compactness of the data. In this paper, we propose a new selection criterion that aims at the vertical compactness of the data. In particular, we select a subset of features that yields the fewest projected instances while still determining the class label. We propose a hybrid search, partly depth-first (DFS) and partly breadth-first (BFS), to exploit the pruning potential of the problem. We compare the results induced by C4.5 before and after feature selection.
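To make the vertical-compactness criterion concrete, the following is a minimal Python sketch. It is not the hybrid DFS/BFS search described in the paper: it enumerates every feature subset by brute force, which is feasible only for a handful of features. The function names (`determines_class`, `projected_count`, `best_subset`) and the toy dataset `rows` are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

# Toy dataset (hypothetical): each row is ((A, B, C), class_label).
rows = [
    ((1, 0, 0), "x"),
    ((2, 0, 0), "x"),
    ((3, 0, 1), "y"),
    ((4, 1, 0), "y"),
]

def determines_class(rows, features):
    """True if no two rows agree on `features` but differ in class label."""
    seen = {}
    for values, label in rows:
        key = tuple(values[i] for i in features)
        if seen.setdefault(key, label) != label:
            return False
    return True

def projected_count(rows, features):
    """Number of distinct instances after projecting onto `features`."""
    return len({tuple(values[i] for i in features) for values, _ in rows})

def best_subset(rows, n_features):
    """Brute-force search for the class-determining subset with the
    fewest projected instances. Returns (count, subset) or None."""
    best = None
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            if not determines_class(rows, subset):
                continue
            count = projected_count(rows, subset)
            if best is None or count < best[0]:
                best = (count, subset)
    return best

print(best_subset(rows, 3))  # (3, (1, 2))
```

In this toy data, feature A alone determines the class, so a criterion based on horizontal compactness would pick the single feature A, leaving four projected instances. The vertical criterion instead picks {B, C}: two features, but only three projected instances.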