Recent research has yielded a wide array of drift detectors. However, to achieve good performance, they require the true class labels to be available during the drift detection phase. This paper targets drift detection when the ground truth is unknown during the detection phase. To that end, we introduce the Gaussian Split Detector (GSD), a novel drift detector that works in batch mode. GSD is designed to work when the data follow a normal distribution and uses Gaussian mixture models to monitor changes in the decision boundary. The algorithm is designed to handle multidimensional data streams and to work without ground truth labels during the inference phase, making it pertinent for real-world use. In an extensive experimental study on real and synthetic datasets, we evaluate our detector against the state of the art. We show that our detector outperforms the state of the art in detecting real drift and in ignoring virtual drift, which is key to avoiding false alarms.