IREX

Abstract

This paper presents IREX: a reusable method for the Iterative Refinement and EXplanation of classification models. It is designed for domain-expert users without machine learning skills who need to understand and improve classification models; accordingly, the only input it requires is the expected classification outcomes provided by a domain expert. IREX proposes a combination of explainable AI (XAI) methods that identifies potential inconsistencies in the model and explains their causes to the user. Following a cycle analogous to case-based reasoning (CBR), a set of candidate anomalous variables is identified (retrieved) and proposed to an expert user for revision. Once the revision is confirmed, the model is purged and retrained for further optimization. This is a novel process in which explanation methods are applied not only to explore a black-box model, but also to detect which input variables led to misclassifications, proposing these variables to the domain-expert user and explaining their negative impact. We propose an automatic evaluation approach that computes the number of anomalous input variables the expert was able to identify and compares it with the evolution of the classification model’s performance. We then apply this evaluation method to demonstrate the performance of the proposal on the given dataset. Finally, we provide a reusable implementation that can be directly applied to other classification models and domains.
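The retrieve, revise, purge, and retrain cycle summarized above can be illustrated with a minimal Python sketch. It is not the reusable implementation referred to in the abstract: the function `irex_style_loop`, its parameters (`train`, `find_candidates`, `expert_confirms`, `max_rounds`), and the toy stand-ins below are all hypothetical names introduced for illustration. In practice, `find_candidates` would wrap XAI attribution methods compared against the expert's expected outcomes.

```python
from typing import Callable, List, Sequence


def irex_style_loop(
    features: Sequence[str],
    train: Callable[[List[str]], object],
    find_candidates: Callable[[object, List[str]], List[str]],
    expert_confirms: Callable[[str], bool],
    max_rounds: int = 10,
):
    """Hypothetical sketch of an iterative refinement cycle.

    Each round retrieves candidate anomalous variables, asks the expert to
    revise them, purges the confirmed ones, and retrains the model. The loop
    stops when the expert confirms nothing or max_rounds is reached.
    """
    active = list(features)
    removed: List[str] = []
    model = train(active)
    for _ in range(max_rounds):
        candidates = find_candidates(model, active)                 # retrieve
        confirmed = [f for f in candidates if expert_confirms(f)]   # revise
        if not confirmed:
            break
        active = [f for f in active if f not in confirmed]          # purge
        removed.extend(confirmed)
        model = train(active)                                       # retrain
    return model, active, removed


# Toy stand-ins (assumptions): the "model" is a dict of made-up
# per-feature attribution scores, not a real classifier.
toy_scores = {"age": 0.4, "q1": -0.3, "q2": 0.2, "q3": -0.1}


def toy_train(active):
    return {f: toy_scores[f] for f in active}


def toy_find_candidates(model, active):
    # Propose at most one variable per round: the most negative attribution.
    negatives = [f for f in active if model[f] < 0]
    return sorted(negatives, key=lambda f: model[f])[:1]


def toy_expert(feature):
    # The simulated expert accepts every proposal except "q3".
    return feature != "q3"


model, active, removed = irex_style_loop(
    list(toy_scores), toy_train, toy_find_candidates, toy_expert
)
```

In this toy run, the first round proposes and purges `q1`. The second round proposes `q3`, which the simulated expert rejects, so the loop terminates with `q3` still in the model. Passing the training, retrieval, and expert steps as callables keeps the sketch independent of any particular classifier or explanation method.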