Sreerama Murthy, Simon Kasif, Steven Salzberg, Richard Beigel
This paper introduces OC1, a new algorithm for generating multivariate decision trees. Multivariate trees classify examples by testing linear combinations of the features at each non-leaf node of the tree. Each test is equivalent to a hyperplane at an oblique orientation to the axes. Because of the computational intractability of finding an optimal orientation for these hyperplanes, heuristic methods must be used to produce good trees. This paper explores a new method that combines deterministic and randomized procedures to search for a good tree. Experiments on several different real-world data sets demonstrate that the method consistently finds much smaller trees than comparable methods using univariate tests. In addition, the accuracy of the trees found with our method matches or exceeds the best results of other machine learning methods.