Concept drift detection is crucial to the reliability of many AI systems. These systems often have to process large amounts of data or react in real time. Drift detectors must therefore meet computational requirements or constraints, and this should be verified by a comprehensive performance evaluation. So far, however, the development of drift detectors has focused on detection quality, e.g., accuracy, rather than on computational performance, such as running time. We show that previous work treats computational performance only as a secondary objective and lacks a benchmark for evaluating it. Hence, we propose a set of metrics that considers both computational performance and detection quality. Among others, our set of metrics includes the Relative Runtime Overhead (RRO), which evaluates a drift detector's computational impact on an AI system. This work focuses on unsupervised drift detectors, which do not depend on the availability of labeled data. We measure computational performance, based on the RRO and memory consumption, of four available unsupervised drift detectors on five different data sets. The RRO ranges from 1.01 to 20.15. Moreover, we measure state-of-the-art detection quality metrics to discuss our evaluation results and to show the necessity of thorough computational performance considerations for drift detectors. Additionally, we highlight and explain the requirements for a comprehensive benchmark of drift detectors. Our investigations can also be extended to supervised drift detection.
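To make the RRO concrete, the following is a minimal Python sketch under a stated assumption: the abstract does not give the formula, so the sketch assumes the RRO is the ratio of the system's runtime with the drift detector to its runtime without it. That reading is consistent with the reported range of 1.01 to 20.15 but may differ from the paper's exact definition. The names `relative_runtime_overhead` and `best_runtime` are hypothetical helpers introduced only for illustration.

```python
import time


def relative_runtime_overhead(runtime_with_detector, runtime_without_detector):
    """Assumed RRO: runtime with the detector divided by runtime without it.

    A value of 1.0 would mean the detector adds no overhead; a value of 2.0
    would mean the system takes twice as long when the detector is enabled.
    This ratio is an assumption, not a definition taken from the paper.
    """
    if runtime_without_detector <= 0:
        raise ValueError("baseline runtime must be positive")
    return runtime_with_detector / runtime_without_detector


def best_runtime(fn, *args, repeats=5):
    """Return the smallest wall-clock time of `fn(*args)` over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best
```

In use, one would time the same pipeline twice with `best_runtime`, once with the detector enabled and once without, and pass both timings to `relative_runtime_overhead`.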