In the classic machine learning framework, models are trained on historical data and used to predict future values, under the assumption that the data distribution does not change over time (stationarity). In real-world scenarios, however, the data-generating process changes over time, and the model has to adapt to the incoming data. This phenomenon is known as concept drift, and it degrades the performance of the predictive model. In this study, we propose ADDM, a new concept drift detection method based on autoregressive models. ADDM can be integrated with any machine learning algorithm, from deep neural networks to simple linear regression models. Our results show that it outperforms state-of-the-art drift detection methods on both synthetic and real-world data sets. The approach is backed by theoretical guarantees and is empirically effective at detecting various types of concept drift. In addition to the drift detector, we propose a new concept drift adaptation method based on the severity of the drift.
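The abstract does not describe the internals of ADDM, so the following is only a minimal sketch of the general idea of autoregressive drift detection, not the authors' method. It assumes that the monitored signal is the stream of a model's prediction errors, that an AR(1) model is fitted once on a reference window, and that drift is flagged when the one-step-ahead residual exceeds `k` standard deviations for `patience` consecutive points. The names `fit_ar1`, `detect_drift`, `k`, and `patience`, as well as the synthetic error series, are hypothetical choices made for illustration.

```python
import math


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + e[t].

    Returns (c, phi, sigma), where sigma is the residual standard deviation.
    """
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var_prev = sum((a - mean_prev) ** 2 for a in x_prev)
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    phi = cov / var_prev if var_prev > 0 else 0.0
    c = mean_next - phi * mean_prev
    residuals = [b - (c + phi * a) for a, b in zip(x_prev, x_next)]
    sigma = math.sqrt(sum(r * r for r in residuals) / n)
    return c, phi, sigma


def detect_drift(reference, stream, k=4.0, patience=3):
    """Return the index in `stream` of the first drift alarm, or None.

    Illustrative rule (an assumption, not ADDM): raise an alarm once the
    one-step-ahead AR(1) residual exceeds k * sigma for `patience`
    consecutive observations.
    """
    c, phi, sigma = fit_ar1(reference)
    sigma = max(sigma, 1e-12)  # guard against a perfectly predictable reference
    prev = reference[-1]
    run = 0
    for i, value in enumerate(stream):
        residual = value - (c + phi * prev)
        run = run + 1 if abs(residual) > k * sigma else 0
        if run >= patience:
            return i
        prev = value
    return None


# Synthetic error series: a small oscillation around 0.1 (hypothetical data).
def error_at(i):
    return 0.1 + 0.02 * math.sin(0.5 * i)


reference = [error_at(i) for i in range(200)]
stable = [error_at(i) for i in range(200, 300)]
# Same continuation, but the error level jumps by 0.9 from position 50 onward.
drifted = [error_at(200 + j) + (0.9 if j >= 50 else 0.0) for j in range(100)]

print("stable stream alarm:", detect_drift(reference, stable))
print("drifted stream alarm:", detect_drift(reference, drifted))
```

Because the AR parameters stay fixed after the reference fit, a sustained level shift keeps producing large residuals, which is what the `patience` counter relies on. A severity-aware adaptation step, as mentioned in the abstract, could use the residual magnitude at the alarm, but that mapping is not specified here and is left out of the sketch.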