Concept drift refers to the phenomenon that the distribution generating the observed data changes over time. If drift is present, machine learning models can become inaccurate and need adjustment. While methods exist to detect concept drift or to adjust models in the presence of observed drift, the question of explaining drift, i.e., describing the potentially complex and high-dimensional change of distribution in a human-understandable fashion, has hardly been considered so far. This problem is important because an explanation allows inspection of the most prominent characteristics of how and where drift manifests itself. Hence, it enables human understanding of the change and increases acceptance of lifelong learning models. In this paper, we present a novel technology that characterizes concept drift in terms of the characteristic change of spatial features, based on various explanation techniques. To do so, we propose a methodology that reduces the explanation of concept drift to the explanation of models trained in a suitable way to extract relevant information regarding the drift. This makes a large variety of explanation schemes available, so that a suitable method can be selected for the drift explanation problem at hand. We outline the potential of this approach and demonstrate its usefulness in several examples.
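The reduction described above can be illustrated with a minimal sketch. It is not the method of the paper, only one plausible instance of the idea under stated assumptions: samples from a window before and a window after a suspected drift are labeled 0 and 1, a simple model is trained to tell the windows apart, and the model is then explained. Here the model is a hand-written logistic regression and the "explanation" is the magnitude of its weights. The synthetic data, the function names (`make_windows`, `fit_logistic`, `explain_drift`), and the choice of weight magnitude as the explanation scheme are all assumptions made for illustration.

```python
import math
import random


def make_windows(n=400, seed=0):
    """Synthetic data (assumption): only feature 1 shifts its mean after the drift."""
    rng = random.Random(seed)
    before = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    after = [[rng.gauss(0, 1), rng.gauss(2, 1), rng.gauss(0, 1)] for _ in range(n)]
    return before, after


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def explain_drift(before, after):
    """Train a 'before vs. after' model and rank features by absolute weight."""
    X = before + after
    y = [0] * len(before) + [1] * len(after)
    w, _ = fit_logistic(X, y)
    ranking = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return ranking, w


if __name__ == "__main__":
    before, after = make_windows()
    ranking, weights = explain_drift(before, after)
    print("features ranked by drift relevance:", ranking)
    print("weights:", [round(v, 3) for v in weights])
```

Because the explanation step only needs a trained model, the weight ranking could in principle be swapped for any other model explanation scheme, which is the flexibility the abstract points to. A linear model would miss drift that leaves the feature means unchanged, so a more expressive model may be needed in practice.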