Desiderata for Interpretability: Explaining Decision Tree Predictions with Counterfactuals
Explanations in machine learning come in many forms, but a consensus regarding their desired properties is still emerging. In our work we collect and organise these explainability desiderata and discuss how they can be used to systematically evaluate properties and quality of an explainable system using the case of class-contrastive counterfactual statements. This leads us to propose a novel method for explaining predictions of a decision tree with counterfactuals. We show that our model-specific approach exploits all the theoretical advantages of counterfactual explanations, hence improves decision tree interpretability by decoupling the quality of the interpretation from the depth and width of the tree.