Rikard König, Ulf Johansson, Lars Niklasson
Rule extraction is a technique aimed at transforming highly accurate opaque models, such as neural networks, into comprehensible models without losing accuracy. G-REX is a rule extraction technique based on Genetic Programming that has performed well in several previous studies. This study has two objectives: to evaluate two new fitness functions for G-REX and to show how G-REX can be used as a rule inducer. The fitness functions are designed to optimize two alternative quality measures: area under the ROC curve (AUC) and a new comprehensibility measure called brevity. Rules with good brevity classify typical instances with few and simple tests and use complex conditions only for atypical examples. Experiments using thirteen publicly available data sets show that the two novel fitness functions succeeded in increasing brevity and AUC without sacrificing accuracy. When compared to a standard decision tree algorithm, G-REX achieved slightly higher accuracy and also improved the quality of the rules by significantly increasing their AUC or brevity.
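To make the two quality measures more concrete, the following is a minimal Python sketch, not the G-REX implementation or its fitness functions. The abstract does not give the formal definition of brevity, so the sketch assumes a simple proxy: the average number of tests a rule evaluates per instance. AUC is computed in its standard rank-based form, as the probability that a randomly chosen positive instance is scored above a randomly chosen negative one. The names `Leaf`, `Node`, `classify`, `mean_tests`, and `auc`, as well as the toy tree and data, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int


@dataclass
class Node:
    feature: int      # index of the tested attribute
    threshold: float  # the test is x[feature] <= threshold
    left: "Tree"      # branch taken when the test is true
    right: "Tree"     # branch taken when the test is false


Tree = Union[Leaf, Node]


def classify(tree, x):
    """Return (predicted label, number of tests evaluated for x)."""
    tests = 0
    while isinstance(tree, Node):
        tests += 1
        tree = tree.left if x[tree.feature] <= tree.threshold else tree.right
    return tree.label, tests


def mean_tests(tree, X):
    """Assumed brevity proxy: average tests per instance (lower is briefer)."""
    return sum(classify(tree, x)[1] for x in X) / len(X)


def auc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy rule: typical instances (x[0] <= 5) exit after one test,
# atypical ones need a second test on x[1].
tree = Node(0, 5.0, Leaf(0), Node(1, 2.0, Leaf(1), Leaf(0)))
X = [[1.0, 0.0], [2.0, 9.0], [3.0, 1.0], [8.0, 1.0]]

print(mean_tests(tree, X))                        # 1.25
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))   # 0.75
```

In this toy setting, three of the four instances are classified after a single test, so the proxy is low even though the tree contains a second, more specific condition for the atypical instance.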
Subjects: 12. Machine Learning and Discovery
Submitted: Feb 25, 2008