Adjusting for Multiple Comparisons in Decision Tree Pruning

David Jensen, Matt Schmill

Pruning is a common technique for avoiding overfitting in decision trees. Most pruning techniques do not account for one important factor: multiple comparisons. Multiple comparisons occur when an induction algorithm examines several candidate models and selects the one that best accords with the data. Making multiple comparisons produces incorrect inferences about model accuracy. We examine Bonferroni pruning, a method that adjusts for multiple comparisons when pruning decision trees. In experiments with artificial and realistic datasets, Bonferroni pruning produces smaller trees that are at least as accurate as trees pruned using other common approaches.
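To make the idea of adjusting for multiple comparisons concrete, the sketch below shows the generic Bonferroni correction applied to a split's significance test. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not describe the test statistic or how comparisons are counted, so the function names, the use of a raw p-value as input, and the 0.05 threshold are all hypothetical.

```python
def bonferroni_adjust(p_value: float, num_comparisons: int) -> float:
    """Bonferroni-adjusted p-value: the raw p-value times the number of
    candidates examined, capped at 1. This is the standard correction,
    assumed here rather than taken from the paper."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if num_comparisons < 1:
        raise ValueError("num_comparisons must be at least 1")
    return min(1.0, p_value * num_comparisons)


def familywise_error_rate(alpha: float, num_comparisons: int) -> float:
    """Probability that at least one of k independent tests looks
    significant at level alpha when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_comparisons


def keep_split(p_value: float, num_comparisons: int, alpha: float = 0.05) -> bool:
    """Hypothetical pruning rule: keep a split only if its adjusted
    p-value is still at or below alpha; otherwise prune it."""
    return bonferroni_adjust(p_value, num_comparisons) <= alpha


if __name__ == "__main__":
    # With 20 candidate splits, an unadjusted 0.05 test is likely to
    # accept at least one split by chance alone.
    print(round(familywise_error_rate(0.05, 20), 3))
    # The same raw p-value survives when 3 candidates were examined
    # but not when 10 were.
    print(keep_split(0.01, 3), keep_split(0.01, 10))
```

Under this rule, the more candidate splits a node considered, the stronger the evidence a split needs to survive pruning, which is one way such an adjustment could lead to smaller trees.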

This page is copyrighted by AAAI. All rights reserved.